How to Capture a Dump When .NET Misbehaves in Docker

2023-06-26 15:00:41

I: Background

1. The story

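Plenty of readers have told me: from your articles I learned how to capture dumps on Windows for crashes, high CPU, memory spikes and so on, so why haven't you written about doing the same in Docker? Do you look down on it?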

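Ha. In my dump-analysis work, .NET running in Docker really doesn't come up that often; roughly 1-2 out of every 10 dumps come from Docker. The market decides my research direction, but to fill this gap I decided to write an article on capturing dumps for these three major failure types.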

II: Capturing the three major failure types in Docker

1. Capturing a crash dump

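Not long ago I wrote an article, "How to Capture a Dump When .NET Crashes on Linux" (https://www.cnblogs.com/huangxincheng/p/17440153.html), which used the environment-variable approach recommended by Microsoft. It works just the same in Docker.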

To make the webapi crash and exit, I deliberately trigger a stack overflow. Reference code:


    public class Program
    {
        public static void Main(string[] args)
        {
            var builder = WebApplication.CreateBuilder(args);
            builder.Services.AddAuthorization();
            var app = builder.Build();
            app.UseAuthorization();

            //1. crash: start the runaway recursion on a thread-pool thread
            Task.Factory.StartNew(() =>
            {
                Test("a");
            });

            app.Run();
        }

        // Recurses with no base case, so the stack eventually overflows and the process crashes.
        public static string Test(string a)
        {
            return Test("a" + a.Length);
        }
    }

With the code in place, next write a Dockerfile; the main point is to set the three environment variables.


FROM mcr.microsoft.com/dotnet/aspnet:6.0 AS runtime
WORKDIR /app
COPY ./ ./

# 1. Use the USTC mirror
RUN sed -i 's/deb.debian.org/mirrors.ustc.edu.cn/g' /etc/apt/sources.list

# 2. Have the runtime write a dump on crash (type 4 = full dump)
ENV COMPlus_DbgMiniDumpType=4
ENV COMPlus_DbgMiniDumpName=/dumps/%p-%e-%h-%t.dmp
ENV COMPlus_DbgEnableMiniDump=1

ENTRYPOINT ["dotnet", "AspNetWebApi.dll"]
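To make the placeholders in COMPlus_DbgMiniDumpName concrete, here is a small sketch that mimics the expansion by hand: %p is the process ID, %e the executable name, %h the host name (the container ID inside Docker) and %t the time in seconds since the epoch. The function name expand_dump_name is hypothetical and only illustrates the naming; the runtime does the real expansion itself. The sample values are the ones that show up in the createdump output later in this article.

```shell
#!/bin/sh
# Hypothetical helper: imitates how the dump name template is filled in.
# $1 = template, $2 = pid, $3 = executable name, $4 = host name, $5 = epoch seconds
expand_dump_name() {
  printf '%s\n' "$1" | sed \
    -e "s/%p/$2/g" \
    -e "s/%e/$3/g" \
    -e "s/%h/$4/g" \
    -e "s/%t/$5/g"
}

# prints /dumps/1-dotnet-ca34c9274d99-1687746929.dmp
expand_dump_name '/dumps/%p-%e-%h-%t.dmp' 1 dotnet ca34c9274d99 1687746929
```

Including the host name and timestamp keeps dumps from different containers and different crashes from overwriting each other in the shared directory.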

One detail here: so that the webapi inside Docker is reachable from outside the container, bind to * instead of localhost by editing appsettings.json as follows:


{
  "urls": "http://*:5001",
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  },
  "AllowedHosts": "*"
}

With these basics in place, all that remains is docker build & docker run.


[root@localhost data]# docker build -t aspnetapp .
[+] Building 0.3s (9/9) FINISHED                                                                         
 => [internal] load build definition from Dockerfile                                                0.0s
 => => transferring dockerfile: 447B                                                                0.0s
 => [internal] load .dockerignore                                                                   0.0s
 => => transferring context: 2B                                                                     0.0s
 => [internal] load metadata for mcr.microsoft.com/dotnet/aspnet:6.0                                0.3s
 => [1/4] FROM mcr.microsoft.com/dotnet/aspnet:6.0@sha256:a2a04325fdb2a871e964c89318921f82f6435b54  0.0s
 => [internal] load build context                                                                   0.0s
 => => transferring context: 860B                                                                   0.0s
 => CACHED [2/4] WORKDIR /app                                                                       0.0s
 => CACHED [3/4] COPY ./ ./                                                                         0.0s
 => CACHED [4/4] RUN sed -i 's/deb.debian.org/mirrors.ustc.edu.cn/g' /etc/apt/sources.list          0.0s
 => exporting to image                                                                              0.0s
 => => exporting layers                                                                             0.0s
 => => writing image sha256:be69203995c0e5423b2af913549e618d7ee8306fff3961118ff403b1359ae571        0.0s
 => => naming to docker.io/library/aspnetapp                                                        0.0s

[root@localhost data]# docker run -itd  -p 5001:5001 --privileged -v /data2:/dumps --name aspnetcore_sample aspnetapp
ca34c9274d998096f8562cbef3a43a7cbd9aa5ff2923e0f3e702b159e0b2f447

[root@localhost data]# docker ps -a
CONTAINER ID   IMAGE       COMMAND                  CREATED          STATUS                       PORTS     NAMES
ca34c9274d99   aspnetapp   "dotnet AspNetWebApi…"   20 seconds ago   Exited (139) 9 seconds ago             aspnetcore_sample

[root@localhost data]# docker logs ca34c9274d99
   ...
   at AspNetWebApi.Program.Test(System.String)
   at AspNetWebApi.Program.Test(System.String)
   at AspNetWebApi.Program.Test(System.String)
   at AspNetWebApi.Program.Test(System.String)
   at AspNetWebApi.Program+<>c.<Main>b__0_0()
   at System.Threading.Tasks.Task.InnerInvoke()
   at System.Threading.Tasks.Task+<>c.<.cctor>b__272_0(System.Object)
   at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(System.Threading.Thread, System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef, System.Threading.Thread)
   at System.Threading.Tasks.Task.ExecuteEntryUnsafe(System.Threading.Thread)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool+WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()

[createdump] Gathering state for process 1 dotnet
[createdump] Crashing thread 0017 signal 6 (0006)
[createdump] Writing full dump to file /dumps/1-dotnet-ca34c9274d99-1687746929.dmp
[createdump] Written 261320704 bytes (63799 pages) to core file
[createdump] Target process is alive
[createdump] Dump successfully written

[root@localhost data2]# cd /data2
[root@localhost data2]# ls -ln
total 255288
-rw-------. 1 0 0 261414912 Jun 26 10:35 1-dotnet-ca34c9274d99-1687746929.dmp

The session above is fairly self-explanatory. A few points worth noting:

  • --privileged

Be sure to grant this elevated privilege; otherwise dump generation fails with a permission error.

  • -v /data2:/dumps

To avoid losing the dump, remember to mount the dump path to a host directory or a shared container.
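Since the dump only survives if the mounted host directory is usable, a quick pre-flight check before docker run can save a wasted reproduction. A minimal sketch, assuming the host directory is /data2 as in this article; the function name check_dump_dir is hypothetical:

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeeds only if the host dump directory
# exists and is writable by the current user.
check_dump_dir() {
  dir=$1
  if [ ! -d "$dir" ]; then
    echo "missing: $dir" >&2
    return 1
  fi
  if [ ! -w "$dir" ]; then
    echo "not writable: $dir" >&2
    return 1
  fi
  echo "ok: $dir"
}
```

It can be chained in front of the run command used above, for example: check_dump_dir /data2 && docker run -itd -p 5001:5001 --privileged -v /data2:/dumps --name aspnetcore_sample aspnetapp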

2. Capturing a dump on a memory spike

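To monitor the memory of a .NET program in Docker, I have always strongly recommended procdump. The latest version at the time of writing is 1.5; the GitHub repository is https://github.com/Sysinternals/ProcDump-for-Linux. Since access to GitHub can be slow, you can download procdump_1.5-16239_amd64.deb to your machine first. The .deb package is the right one because the container runs Debian.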

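Once downloaded, put it in the project (the Dockerfile below installs it as procdump.deb, so save it under that name) and start from the default code skeleton: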


    public class Program
    {
        public static void Main(string[] args)
        {
            var builder = WebApplication.CreateBuilder(args);
            builder.Services.AddAuthorization();
            var app = builder.Build();
            app.UseAuthorization();

            app.Run();
        }
    }

Next comes the Dockerfile. One detail here is how to run multiple processes in Docker; this example does it with a start.sh script. Reference code:


FROM mcr.microsoft.com/dotnet/aspnet:6.0 AS runtime
WORKDIR /app
COPY ./ ./

# 1. Use the USTC mirror
RUN sed -i 's/deb.debian.org/mirrors.ustc.edu.cn/g' /etc/apt/sources.list

# 2. Install gdb & procdump
RUN apt-get update && apt-get install -y gdb
RUN dpkg -i procdump.deb

RUN echo "#!/bin/bash \n\
procdump -m 30 -w dotnet /dumps & \n\
dotnet \$1 \n\
" > ./start.sh

RUN chmod +x ./start.sh

ENTRYPOINT ["./start.sh", "AspNetWebApi.dll"]

With these settings in place, publish the code and build it with docker. For ease of demonstration, the container is run in the foreground this time.


[root@localhost data]# docker build -t aspnetapp .
[+] Building 11.5s (13/13) FINISHED              

[root@localhost data]# docker rm -f aspnetcore_sample
aspnetcore_sample
[root@localhost data]# docker run -it --rm  -p 5001:5001 --privileged -v /data2:/dumps --name aspnetcore_sample aspnetapp
ProcDump v1.5 - Sysinternals process dump utility
Copyright (C) 2023 Microsoft Corporation. All rights reserved. Licensed under the MIT license.
Mark Russinovich, Mario Hewardt, John Salem, Javid Habibi
Sysinternals - www.sysinternals.com

Monitors one or more processes and writes a core dump file when the processes exceeds the
specified criteria.

[02:57:34 - INFO]: Waiting for processes 'dotnet' to launch

[02:57:34 - INFO]: Press Ctrl-C to end monitoring without terminating the process(es).
Process Name:                           dotnet
CPU Threshold:                          n/a
Commit Threshold:                       >=30 MB
Thread Threshold:                       n/a
File Descriptor Threshold:              n/a
Signal:                                 n/a
Exception monitor                       Off
Polling Interval (ms):                  1000
Threshold (s):                          10
Number of Dumps:                        1
Output directory:                       /dumps
[02:57:34 - INFO]: Starting monitor for process dotnet (9)
info: Microsoft.Hosting.Lifetime[14]
      Now listening on: http://[::]:5001
info: Microsoft.Hosting.Lifetime[0]
      Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0]
      Hosting environment: Production
info: Microsoft.Hosting.Lifetime[0]
      Content root path: /app/
[02:57:35 - INFO]: Trigger: Commit usage:48MB on process ID: 9
[createdump] Gathering state for process 9 dotnet
[createdump] Writing full dump to file /dumps/dotnet_commit_2023-06-26_02:57:35.9
[createdump] Written 254459904 bytes (62124 pages) to core file
[createdump] Target process is alive
[createdump] Dump successfully written
[02:57:35 - INFO]: Core dump 0 generated: /dumps/dotnet_commit_2023-06-26_02:57:35.9
[02:57:36 - INFO]: Stopping monitors for process: dotnet (9)

[root@localhost data2]# ls -lh
total 243M
-rw-------. 1 root root 243M Jun 26 10:57 dotnet_commit_2023-06-26_02:57:35.9

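From the output, the dump was triggered when commit usage was at 48 MB (past the 30 MB threshold), and the file was written to the /dumps directory as expected. Great.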

3. Capturing a dump on high CPU

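The best way to capture dumps for high CPU is to take several. For example: when CPU stays above 20% for more than 5 consecutive seconds, capture 2 dumps. With dumps captured this way it is easy to find the real culprit. For ease of demonstration, the code below pegs two CPU cores outright. Reference code: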


        public static void Main(string[] args)
        {
            var builder = WebApplication.CreateBuilder(args);
            builder.Services.AddAuthorization();
            var app = builder.Build();
            app.UseAuthorization();

            //3. cpu: each request starts two busy loops, pegging two cores
            app.MapGet("/cpu", (HttpContext httpContext) =>
            {
                Task.Factory.StartNew(() => { bool b = true; while (true) { b = !b; } });
                Task.Factory.StartNew(() => { bool b = true; while (true) { b = !b; } });

                return new WeatherForecast();
            });

            app.Run();
        }

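Next, modify the Dockerfile. My virtual machine has 8 cores, so if two cores are pegged, CPU utilization should be roughly 25% (2 of 8); the script therefore sets the threshold to 20%.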


FROM mcr.microsoft.com/dotnet/aspnet:6.0 AS runtime
WORKDIR /app
COPY ./ ./

# 1. Use the USTC mirror
RUN sed -i 's/deb.debian.org/mirrors.ustc.edu.cn/g' /etc/apt/sources.list

# 2. Install gdb & procdump
RUN apt-get update && apt-get install -y gdb
RUN dpkg -i procdump.deb

RUN echo "#!/bin/bash \n\
procdump -c 20 -n 2 -s 5 -w dotnet /dumps & \n\
dotnet \$1 \n\
" > ./start.sh

RUN chmod +x ./start.sh

ENTRYPOINT ["./start.sh", "AspNetWebApi.dll"]
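The 20% threshold in the script above comes from a quick estimate: two fully busy cores on an 8-core machine. A minimal sketch of that arithmetic, assuming the CPU percentage is averaged over all cores (the function name expected_cpu_percent is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: whole-machine CPU percentage when $1 of $2 cores are fully busy.
expected_cpu_percent() {
  echo $(( $1 * 100 / $2 ))
}

expected_cpu_percent 2 8   # prints 25
```

Setting the threshold a little below the estimate leaves some margin, so the trigger still fires if the busy threads do not hold both cores at a full 100%. On a machine with a different core count, redo the estimate before reusing the 20% value.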

Finally, the docker build.


[root@localhost data]# docker build -t aspnetapp .
[+] Building 0.4s (13/13) FINISHED

[root@localhost data]# docker run -it --rm  -p 5001:5001 --privileged -v /data2:/dumps --name aspnetcore_sample aspnetapp

ProcDump v1.5 - Sysinternals process dump utility
Copyright (C) 2023 Microsoft Corporation. All rights reserved. Licensed under the MIT license.
Mark Russinovich, Mario Hewardt, John Salem, Javid Habibi
Sysinternals - www.sysinternals.com

Monitors one or more processes and writes a core dump file when the processes exceeds the
specified criteria.

[03:35:56 - INFO]: Waiting for processes 'dotnet' to launch

[03:35:56 - INFO]: Press Ctrl-C to end monitoring without terminating the process(es).
Process Name:                           dotnet
CPU Threshold:                          >= 20%
Commit Threshold:                       n/a
Thread Threshold:                       n/a
File Descriptor Threshold:              n/a
Signal:                                 n/a
Exception monitor                       Off
Polling Interval (ms):                  1000
Threshold (s):                          5
Number of Dumps:                        2
Output directory:                       /dumps
[03:35:56 - INFO]: Starting monitor for process dotnet (8)
info: Microsoft.Hosting.Lifetime[14]
      Now listening on: http://[::]:5001
info: Microsoft.Hosting.Lifetime[0]
      Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0]
      Hosting environment: Production
info: Microsoft.Hosting.Lifetime[0]
      Content root path: /app/

The output shows that monitoring is under way. Next, visit the URL http://192.168.17.129:5001/cpu
After a short wait, two dump files are generated.
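Rather than watching the directory by hand, the wait can be scripted. The following is a rough sketch, assuming the dumps land in the host directory mounted at /dumps (/data2 in this article) and that nothing else is written there; count_dumps and wait_for_dumps are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: count regular files directly inside a dump directory.
count_dumps() {
  find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Hypothetical helper: poll once per second until at least $2 files exist
# in directory $1, giving up after $3 seconds.
wait_for_dumps() {
  dir=$1 want=$2 timeout=$3 waited=0
  while [ "$(count_dumps "$dir")" -lt "$want" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out: $(count_dumps "$dir") of $want dump(s) in $dir" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "found $(count_dumps "$dir") dump(s) in $dir"
}
```

For the run above, the endpoint can be hit and the wait started in one go: curl -s http://192.168.17.129:5001/cpu > /dev/null && wait_for_dumps /data2 2 60. Note that a file is counted as soon as it appears, so a dump may still be in the middle of being written when the function returns.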

III: Summary

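Although .NET programs running in Docker are a small share, it is still well worth writing the experience down. Next time someone asks how to capture a dump, I can just send them this article!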
