前段時間有位朋友微信找到我,說他生產機器上的 Console 服務看起來像是卡死了,也不生成紀錄檔,對方也收不到我的httpclient請求,不知道程式出現什麼情況了,特來尋求幫助。
哈哈,一般來說卡死的情況在表單程式(WinForm,WPF) 上特別多,在 Console,WebApi 中相對較少,既然找到我,那就上 WinDbg 分析吧。
程式之所以能跑的梭梭響,全靠執行緒幫忙,如果玩不轉可能就是執行緒上出了點什麼問題,接下來使用 !t
展示下執行緒列表。
0:000> !t
ThreadCount: 124
UnstartedThread: 0
BackgroundThread: 105
PendingThread: 0
DeadThread: 18
Hosted Runtime: no
Lock
DBG ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt Exception
0 1 65ec 000002097A4BF390 2a020 Preemptive 0000000000000000:0000000000000000 000002097a4bea20 -00001 MTA
3 2 15afc 000002097A50BA10 2b220 Preemptive 00000209060BE8F8:00000209060BEBE8 000002097a4bea20 -00001 MTA (Finalizer)
4 3 11d20 000002097A524EC0 102a220 Preemptive 0000000000000000:0000000000000000 000002097a4bea20 -00001 MTA (Threadpool Worker)
5 9 b8a4 000002097DE10290 3029220 Preemptive 0000000000000000:0000000000000000 000002097a4bea20 -00001 MTA (Threadpool Worker)
6 13 1b22c 000002097DE0ADB0 1029220 Preemptive 0000000000000000:0000000000000000 000002097a4bea20 -00001 MTA (Threadpool Worker)
7 16 1b0e8 000002097DE1A030 202b220 Preemptive 00000209063CB630:00000209063CC1F8 000002097a4bea20 -00001 MTA
8 17 138c8 000002097DE175C0 202b220 Preemptive 00000209063F71C0:00000209063F7BC8 000002097a4bea20 -00001 MTA
9 18 1afd0 000002097DE181E0 202b220 Preemptive 00000209064627E0:0000020906463598 000002097a4bea20 -00001 MTA
10 19 1ac48 000002097DE13310 202b220 Preemptive 000002090632D6C8:000002090632E0D8 000002097a4bea20 -00001 MTA
11 20 18704 000002097DE16390 202b220 Preemptive 00000209063FB5A8:00000209063FBBC8 000002097a4bea20 -00001 MTA
12 21 1ade4 000002097DE187F0 202b220 Preemptive 00000209062EA138:00000209062EA708 000002097a4bea20 -00001 MTA
13 22 13164 000002097DE13920 202b220 Preemptive 0000020906392108:0000020906392A38 000002097a4bea20 -00001 MTA
14 23 1b334 000002097DE169A0 202b220 Preemptive 00000209063CD9E8:00000209063CE1F8 000002097a4bea20 -00001 MTA
...
106 168 19e18 0000020927066770 a029220 Preemptive 0000000000000000:0000000000000000 000002097a4bea20 -00001 MTA (Threadpool Completion Port)
108 136 af74 0000020928590290 8029220 Preemptive 00000209063B60E0:00000209063B6408 000002097a4bea20 -00001 MTA (Threadpool Completion Port)
107 84 e7f0 000002097AD6B600 8029220 Preemptive 00000209063E3300:00000209063E3DD8 000002097a4bea20 -00001 MTA (Threadpool Completion Port)
一般來說卦中的 Lock Count
列表示當前執行緒所持有的託管鎖個數,現在顯示的 -00001
應該是命令不起效果了。。。不過沒關係,我們還可以通過 !syncblk
來看下 lock 的情況,畢竟 95%
的鎖場景都會用到它。
0:000> !syncblk
Index SyncBlock MonitorHeld Recursion Owning Thread Info SyncBlock Owner
176 000002097E0DA1D8 17 1 00000209216A3500 132c8 71 00000209002c6a68 System.Object
191 000002097E0DAC88 27 1 000002097ADC54F0 1ae90 53 00000209000185e8 System.IO.TextWriter+SyncTextWriter
-----------------------------
Total 270
CCW 0
RCW 0
ComClassFactory 0
Free 32
從卦中的 MonitorHeld=27
來看,表示這個 SyncTextWriter 物件當前有 13 個執行緒在等待,有 1 個執行緒在持有,那這個執行緒為什麼沒有退出呢? 接下來可以切到 53
號執行緒上,檢視下它的執行緒棧。
0:053> ~~[1ae90]s
ntdll!NtWriteFile+0x14:
00007ffd`a70df774 c3 ret
0:053> !clrstack
OS Thread Id: 0x1ae90 (53)
Child SP IP Call Site
00000070505BDE88 00007ffda70df774 [InlinedCallFrame: 00000070505bde88] Interop+Kernel32.WriteFile(IntPtr, Byte*, Int32, Int32 ByRef, IntPtr)
00000070505BDE88 00007ffd32a7cbe2 [InlinedCallFrame: 00000070505bde88] Interop+Kernel32.WriteFile(IntPtr, Byte*, Int32, Int32 ByRef, IntPtr)
00000070505BDE50 00007ffd32a7cbe2 ILStubClass.IL_STUB_PInvoke(IntPtr, Byte*, Int32, Int32 ByRef, IntPtr)
00000070505BDF20 00007ffd3a2d6971 System.ConsolePal+WindowsConsoleStream.WriteFileNative(IntPtr, Byte[], Int32, Int32, Boolean)
00000070505BDF80 00007ffd3a2d672f System.ConsolePal+WindowsConsoleStream.Write(Byte[], Int32, Int32) [/_/src/libraries/System.Console/src/System/ConsolePal.Windows.cs @ 1131]
00000070505BDFD0 00007ffd377c05d9 System.IO.StreamWriter.Flush(Boolean, Boolean) [/_/src/libraries/System.Private.CoreLib/src/System/IO/StreamWriter.cs @ 260]
00000070505BE050 00007ffd3a2d6687 System.IO.StreamWriter.WriteLine(System.String)
00000070505BE0D0 00007ffd3a2d6472 System.IO.TextWriter+SyncTextWriter.WriteLine(System.String) [/_/src/libraries/System.Private.CoreLib/src/System/IO/TextWriter.cs @ 880]
00000070505BE130 00007ffd3a2d640d System.Console.WriteLine(System.String) [/_/src/libraries/System.Console/src/System/Console.cs @ 716]
00000070505BE160 00007ffd3a2375ab HCloud.xxx+d__2.MoveNext()
00000070505BEAE0 00007ffd37e19365 System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.__Canon, System.Private.CoreLib]](System.__Canon ByRef) [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncMethodBuilderCore.cs @ 63]
00000070505BEB40 00007ffd3a235cc3 HCloud.xxxx.Execute(Quartz.IJobExecutionContext)
00000070505BEBB0 00007ffd3a231264 Quartz.xxx+d__9.MoveNext()
00000070505BEE90 00007ffd3a230183 System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Quartz.Core.JobRunShell+d__9, Quartz]](d__9 ByRef) [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncMethodBuilderCore.cs @ 63]
00000070505BEF10 00007ffd3a2300db System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[Quartz.Core.JobRunShell+d__9, Quartz]](d__9 ByRef) [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncTaskMethodBuilder.cs @ 33]
00000070505BEF40 00007ffd3a230082 Quartz.xxxxl.Run(System.Threading.CancellationToken)
00000070505BF020 00007ffd3a22ffc3 Quartz.Core.QuartzSchedulerThread+c__DisplayClass28_0.b__0()
00000070505BF060 00007ffd3939b71a System.Threading.Tasks.Task`1[[System.__Canon, System.Private.CoreLib]].InnerInvoke() [/_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/Future.cs @ 507]
00000070505BF0C0 00007ffd37d54431 System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(System.Threading.Thread, System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) [/_/src/libraries/System.Private.CoreLib/src/System/Threading/ExecutionContext.cs @ 300]
00000070505BF110 00007ffd37d53657 System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef, System.Threading.Thread) [/_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs @ 2352]
00000070505BF1C0 00007ffd37d50e04 System.Threading.ThreadPoolWorkQueue.Dispatch() [/_/src/libraries/System.Private.CoreLib/src/System/Threading/ThreadPool.cs @ 677]
00000070505BF690 00007ffd925ea7a3 [DebuggerU2MCatchHandlerFrame: 00000070505bf690]
仔細觀察執行緒棧會很驚訝的發現,居然還能卡在 System.Console.WriteLine
方法上,挺奇怪的,為了探究原因,我們使用 k
命令看下非託管棧。
0:053> k 5
# Child-SP RetAddr Call Site
00 00000070`505bddd8 00007ffd`a32febda ntdll!NtWriteFile+0x14
01 00000070`505bdde0 00007ffd`32a7cbe2 KERNELBASE!WriteFile+0x7a
02 00000070`505bde50 00007ffd`3a2d6971 0x00007ffd`32a7cbe2
03 00000070`505bdf20 00007ffd`3a2d672f System_Console!System.ConsolePal.WindowsConsoleStream.WriteFileNative+0x61
04 00000070`505bdf80 00007ffd`377c05d9 System_Console!System.ConsolePal.WindowsConsoleStream.Write+0x3f
0:053> ub ntdll!NtWriteFile+0x14
ntdll!NtDeviceIoControlFile+0x15:
00007ffd`a70df755 cd2e int 2Eh
00007ffd`a70df757 c3 ret
00007ffd`a70df758 0f1f840000000000 nop dword ptr [rax+rax]
ntdll!NtWriteFile:
00007ffd`a70df760 4c8bd1 mov r10,rcx
00007ffd`a70df763 b808000000 mov eax,8
00007ffd`a70df768 f604250803fe7f01 test byte ptr [SharedUserData+0x308 (00000000`7ffe0308)],1
00007ffd`a70df770 7503 jne ntdll!NtWriteFile+0x15 (00007ffd`a70df775)
00007ffd`a70df772 0f05 syscall
從上面的 syscall
系統呼叫關鍵詞看,程式碼是將使用者態的 ntdll!NtWriteFile
切到入了核心態的 nt!NtWriteFile
方法,那進入了核心態為什麼沒有返回呢? 這又是一個值得思索的問題。
其實 ntdll!NtWriteFile
這個 win32 api 方法的第一個引數是一個 handle 的檔案控制程式碼,簽名如下。
__kernel_entry NTSYSCALLAPI NTSTATUS NtWriteFile(
[in] HANDLE FileHandle,
[in, optional] HANDLE Event,
[in, optional] PIO_APC_ROUTINE ApcRoutine,
[in, optional] PVOID ApcContext,
[out] PIO_STATUS_BLOCK IoStatusBlock,
[in] PVOID Buffer,
[in] ULONG Length,
[in, optional] PLARGE_INTEGER ByteOffset,
[in, optional] PULONG Key
);
可能 handle 在核心中被別人佔用了,可以用 !handle
檢視下 rcx
暫存器。
0:053> r
rax=0000000000000008 rbx=00000070505bdf50 rcx=0000000000000418
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000418
rip=00007ffda70df774 rsp=00000070505bddd8 rbp=00000070505bdf10
r8=0000000002000805 r9=0000000000000000 r10=0000020906191620
r11=00000070505bc8f8 r12=0000000000000100 r13=0000000000000053
r14=0000000000000077 r15=000002097adc54f0
iopl=0 nv up ei pl zr na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
ntdll!NtWriteFile+0x14:
00007ffd`a70df774 c3 ret
0:053> !handle 0000000000000418 f
Handle 0000000000000418
Type File
Attributes 0
GrantedAccess 0x120196:
ReadControl,Synch
Write/Add,Append/SubDir/CreatePipe,WriteEA,ReadAttr,WriteAttr
HandleCount 3
PointerCount 65483
No object specific information available
哈哈,其實也看不出什麼,也沒法進入核心態,所以下一步只能到網上搜搜看,其實有經驗的朋友肯定猜出來了,應該是控制檯啟用了 快捷編輯視窗
,截圖如下:
快捷編輯視窗
簡而言之就是使用者可以在控制檯上獨佔這個視窗,編輯一些內容, 可一旦被使用者獨佔,那程式側就沒法輸出內容到 控制檯視窗
上了,只能在 核心態
傻傻等等,這應該就是形成原因,畫個圖大概就像下面這樣。
將資訊告訴朋友後,朋友說他用的是 Windows 服務部署
,但不管是什麼模式部署,註釋掉 Console.WriteLine 肯定沒錯。
感謝如家的 鄧工 提供的禁用
快捷編輯視窗
程式碼。
#region 關閉控制檯 快速編輯模式、插入模式
const int STD_INPUT_HANDLE = -10;
const uint ENABLE_QUICK_EDIT_MODE = 0x0040;
const uint ENABLE_INSERT_MODE = 0x0020;
[DllImport("kernel32.dll", SetLastError = true)]
internal static extern IntPtr GetStdHandle(int hConsoleHandle);
[DllImport("kernel32.dll", SetLastError = true)]
internal static extern bool GetConsoleMode(IntPtr hConsoleHandle, out uint mode);
[DllImport("kernel32.dll", SetLastError = true)]
internal static extern bool SetConsoleMode(IntPtr hConsoleHandle, uint mode);
public static void DisbleQuickEditMode()
{
IntPtr hStdin = GetStdHandle(STD_INPUT_HANDLE);
uint mode;
GetConsoleMode(hStdin, out mode);
mode &= ~ENABLE_QUICK_EDIT_MODE;//移除快速編輯模式
mode &= ~ENABLE_INSERT_MODE; //移除插入模式
SetConsoleMode(hStdin, mode);
}
#endregion
這次卡死的事故,主要還是開發人員大量的使用 Console.WriteLine 來輸出紀錄檔,在某個時刻輸出端視窗因為各種原因被他人獨佔,導致程式側無法輸出內容到視窗而一直被迫等待,之後朋友將紀錄檔輸出切換到檔案模式,問題得以解決。
其實這個問題很多新手朋友都會犯,特此記錄下來。