Sdcb.FFmpeg: my open-source C# wrapper for FFmpeg

2023-02-27 12:02:27

Preface:

This was my talk at .NET Conf China 2022 in December 2022. Project address: https://github.com/sdcb/Sdcb.FFmpeg

The slides can be downloaded here: https://io.starworks.cc:88/cv-public/2022/.NET玩轉音視訊操作FFmpeg.pptx

The recording can be watched here (starting at 3:19:00): https://bbs.csdn.net/topics/609897502

FFmpeg is the well-known audio/video processing toolkit, and I use it all the time in work and daily life. But I'm also a .NET developer, and when I tried calling FFmpeg from C#, these were the options:

  • Out-of-process invocation, for example:
    • FFmpeg.NET
    • MediaToolkit
    • Xabe.FFmpeg
  • P/Invoke against the C API, for example:
    • FFmpeg.AutoGen
    • EmguFFmpeg
    • Sdcb.FFmpeg

The command-line (out-of-process) route has these pros and cons:

  • Pros: easy to learn, quick to get started, and no conflict with the GPL license
  • Cons: it relies on inter-process interop, with state managed through redirected standard streams
  • Cons: input and output go through files, which makes fine-grained control very hard (see the sketch after the next list)

Calling the C API via P/Invoke solves several of these problems nicely, with the following pros and cons:

  • Pros: input and output can live in memory, and every frame can be controlled precisely
  • Pros: performance is more predictable because the cross-process overhead is gone
  • Cons: the C API itself is fairly complex
  • Cons: the de-facto choice, FFmpeg.AutoGen, mixes C pointers into C#, which can be even more painful to write than the raw C API
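
For contrast, here is a minimal sketch of the out-of-process approach (assuming an ffmpeg executable is already on PATH; the argument string is only an illustration):

using System.Diagnostics;

// Transcode by spawning the ffmpeg CLI: easy to write, but input and output
// must be files, and progress/state can only be scraped from redirected stderr.
var psi = new ProcessStartInfo("ffmpeg", "-i input.mp4 -c:v libx264 -b:v 600k output.mp4")
{
    RedirectStandardError = true,
    UseShellExecute = false,
};
using Process p = Process.Start(psi)!;
string log = p.StandardError.ReadToEnd(); // status has to be parsed from text output
p.WaitForExit();
Console.WriteLine($"ffmpeg exited with code {p.ExitCode}");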

What did I do?

Faced with these difficulties, I took the widely used open-source project FFmpeg.AutoGen as a starting point and built Sdcb.FFmpeg myself. It has the following advantages:

  • It keeps the ability to call the C API directly and keeps cross-platform support
  • The ClangMacroParser dependency was removed and completely rewritten, so it parses more macros than the original
  • Native library loading was switched from manual LoadLibrary calls to automatic [DllImport], which on .NET Core loads the dlls straight from the NuGet package and matches .NET community conventions
  • All large binaries and their history were removed from the repository and are now downloaded automatically, which shrinks the repository considerably
  • Enum names were simplified, e.g. AVCodecID.AV_CODEC_ID_H264 -> AVCodecID.H264
  • Many C macros were turned into C# enums, e.g. ffmpeg.AV_DICT_MATCH_CASE -> AV_DICT_READ.MatchCase
  • Besides the low-level binding, there are mid-level (class) wrappers and high-level (helper) wrappers, such as CodecContext and MediaDictionary (see the small sketch below)
  • I packaged the native dynamic libraries as NuGet packages, so programs can run without installing any external dependencies
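
To give a flavor of those three layers, here is a rough sketch (assuming the raw binding exposes avcodec_version() the same way the C API does; the mid-level and high-level calls mirror example 1 later in this article):

// Low level: the raw C API, 1:1 with FFmpeg (static Sdcb.FFmpeg.Raw.ffmpeg)
Console.WriteLine(ffmpeg.avcodec_version());

// Mid level: classes wrapping the C structs
using CodecContext encoder = new CodecContext(Codec.CommonEncoders.Libx264)
{
    Width = 800,
    Height = 600,
    TimeBase = new AVRational(1, 30),
    PixelFormat = AVPixelFormat.Yuv420p,
};

// High level: IEnumerable-based pipeline helpers (see example 1 below)
// frames.ConvertFrames(encoder).EncodeAllFrames(fc, null, encoder).WriteAll(fc);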

NuGet package list

  • FFmpeg 5.x:
    • Sdcb.FFmpeg
    • Sdcb.FFmpeg.runtime.windows-x64
  • FFmpeg 4.4.x:
    • Sdcb.FFmpeg
    • Sdcb.FFmpeg.runtime.windows-x64

How to use it on Linux/macOS?

On Linux you don't need these runtime NuGet packages. There are many Linux distributions, and most of them already package a library as common as FFmpeg. On Ubuntu 22.04, for example, the FFmpeg 5.x shared libraries can be installed with the following commands:

apt update
apt install software-properties-common
add-apt-repository ppa:savoury1/ffmpeg4 -y
add-apt-repository ppa:savoury1/ffmpeg5 -y
apt update
apt install ffmpeg -y

For FFmpeg 4.x, the shared libraries can be installed like this:

apt update
apt install software-properties-common
add-apt-repository ppa:savoury1/ffmpeg4 -y
apt update
apt install ffmpeg -y

On macOS, the shared libraries can be installed with:

brew install ffmpeg

Runtime NuGet packages tend to be tied to a specific libc and don't generalize well, and Linux usually has better ways to obtain these libraries anyway, so I didn't build runtime NuGet packages for Linux.

But don't get me wrong: Sdcb.FFmpeg is also tested on Linux and runs fine there. GitHub Actions test link: https://github.com/sdcb/Sdcb.FFmpeg/actions
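
If you want a quick sanity check that the native libraries can be located on Linux/macOS, printing the version string is enough. A small sketch, assuming av_version_info() is exposed through Sdcb.FFmpeg.Raw the same way as the rest of the raw binding:

using Sdcb.FFmpeg.Raw;

// Prints something like "5.1.2" when the native FFmpeg libraries load correctly;
// throws DllNotFoundException when they cannot be found.
Console.WriteLine(ffmpeg.av_version_info());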

Why did I start over from scratch?

I didn't actually set out to reinvent the wheel. At first, inspired by Yu Hongwei's EmguFFmpeg project, I felt that FFmpeg.AutoGen really was hard to use, but that by depending on FFmpeg.AutoGen and adding a thin wrapper on top I could save a lot of maintenance work. So during 2020-2021 I developed and maintained the open-source project Sdcb.FFmpegAPIWrapper, built entirely on top of FFmpeg.AutoGen, and by then it was mostly finished (I just never did much promotion, samples, or tutorials).

As the project went deeper, however, I increasingly felt that depending directly on FFmpeg.AutoGen made the code too clunky. The same thing ended up with both a raw spelling and a "high-level" spelling (for instance AVCodecID.AV_CODEC_ID_H264 and AVCodecID.H264 coexisted), and users would most likely get lost. After a long period of hesitation I finally decided to rework FFmpeg.AutoGen itself. The rework took about a year and produced what the project is today.

Six examples of Sdcb.FFmpeg in action

Example 1: generating a video purely from code

Think of this example as FFmpeg's "Hello World". It requires the following NuGet packages:

  • Sdcb.FFmpeg 5.1.2
  • Sdcb.FFmpeg.runtime.windows-x64

And the following namespaces:

  • Sdcb.FFmpeg.Codecs
  • Sdcb.FFmpeg.Formats
  • Sdcb.FFmpeg.Raw
  • Sdcb.FFmpeg.Toolboxs.Extensions
  • Sdcb.FFmpeg.Toolboxs.Generators
  • Sdcb.FFmpeg.Utils

Full code (click to expand):

// this example is based on Sdcb.FFmpeg 5.1.2
FFmpegLogger.LogWriter = (level, msg) => Console.Write(msg);

using FormatContext fc = FormatContext.AllocOutput(formatName: "mp4");
fc.VideoCodec = Codec.CommonEncoders.Libx264;
MediaStream vstream = fc.NewStream(fc.VideoCodec);
using CodecContext vcodec = new CodecContext(fc.VideoCodec)
{
    Width = 800,
    Height = 600,
    TimeBase = new AVRational(1, 30),
    PixelFormat = AVPixelFormat.Yuv420p,
    Flags = AV_CODEC_FLAG.GlobalHeader,
};
vcodec.Open(fc.VideoCodec);
vstream.Codecpar!.CopyFrom(vcodec);
vstream.TimeBase = vcodec.TimeBase;

string outputPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "muxing.mp4");
fc.DumpFormat(streamIndex: 0, outputPath, isOutput: true);

using IOContext io = IOContext.OpenWrite(outputPath);
fc.Pb = io;
fc.WriteHeader();
VideoFrameGenerator.Yuv420pSequence(vcodec.Width, vcodec.Height, 600)
	.ConvertFrames(vcodec)
	.EncodeAllFrames(fc, null, vcodec)
	.WriteAll(fc);
fc.WriteTrailer();

After running it you should see a muxing.mp4 file on the desktop. That file was generated entirely by the code above, and the video looks like this:

It's worth pointing out VideoFrameGenerator.Yuv420pSequence: it takes a few parameters and returns an IEnumerable<Frame> (or, in other examples, an IEnumerable<Packet>). This is a very common pattern throughout the project; it shows off how concise and expressive C# can be while still keeping resource management and memory release under control.
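
The same yield-return style is easy to adopt in your own code. Below is a minimal sketch of a custom frame source written that way, reusing one Frame per iteration just like the library's helpers do; the frame-filling logic is only a placeholder and MyFrameSource is a hypothetical name:

static IEnumerable<Frame> MyFrameSource(int width, int height, int count)
{
    using Frame frame = new Frame
    {
        Width = width,
        Height = height,
        Format = (int)AVPixelFormat.Yuv420p,
    };
    for (int i = 0; i < count; ++i)
    {
        // fill frame.Data / frame.Linesize here (placeholder), then stamp the timestamp
        frame.Pts = i;
        yield return frame; // the single Frame instance is reused, so no per-frame allocation
    }
}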

Example 2: compressing a video

This example shows how to compress a video to the parameters below; these are also the parameters under which the WeChat Windows desktop client will not re-compress your video:

  • Video codec: H264
  • Video bitrate: under 600 kbps
  • Video resolution: unrestricted, but 960 px on the long edge is recommended
  • Audio codec: AAC
  • Audio bitrate: 48 kbps

The following NuGet packages are required:

  • Sdcb.FFmpeg 5.1.2
  • Sdcb.FFmpeg.runtime.windows-x64

And the following namespaces:

  • Sdcb.FFmpeg.Codecs
  • Sdcb.FFmpeg.Common
  • Sdcb.FFmpeg.Filters
  • Sdcb.FFmpeg.Formats
  • Sdcb.FFmpeg.Raw
  • Sdcb.FFmpeg.Toolboxs
  • Sdcb.FFmpeg.Toolboxs.Extensions
  • Sdcb.FFmpeg.Toolboxs.FilterTools
  • Sdcb.FFmpeg.Toolboxs.Generators
  • Sdcb.FFmpeg.Utils
  • static Sdcb.FFmpeg.Raw.ffmpeg
  • System.Collections.Concurrent
  • System.Runtime.CompilerServices
  • System.Threading.Tasks

Full code (click to expand); note this sample is a LINQPad script, so helpers such as QueryCancelToken and .Dump() come from LINQPad:

void Main()
{
	FFmpegLogger.LogLevel = LogLevel.Error;
	FFmpegLogger.LogWriter = (level, msg) => Console.Write(msg);

	Task.Run(() => A7r3VideoToWechat(@"Y:\a7r3\2022-12-12\C0060.MP4")).Wait();
}

void A7r3VideoToWechat(string mp4Path)
{
	using FormatContext inFc = FormatContext.OpenInputUrl(mp4Path);
	inFc.LoadStreamInfo();

	// prepare input stream/codec
	MediaStream inAudioStream = inFc.GetAudioStream();
	using CodecContext audioDecoder = new(Codec.FindDecoderById(inAudioStream.Codecpar!.CodecId));
	audioDecoder.FillParameters(inAudioStream.Codecpar);
	audioDecoder.Open();
	audioDecoder.ChannelLayout = (ulong)ffmpeg.av_get_default_channel_layout(audioDecoder.Channels);

	MediaStream inVideoStream = inFc.GetVideoStream();
	using CodecContext videoDecoder = new(Codec.FindDecoderByName("h264_cuvid"));
	videoDecoder.FillParameters(inVideoStream.Codecpar!);
	videoDecoder.Open();

	// dest file
	string destFile = Path.Combine(Path.GetDirectoryName(mp4Path)!, Path.GetFileNameWithoutExtension(mp4Path) + "_wechat.mp4");
	using FormatContext outFc = FormatContext.AllocOutput(fileName: destFile);

	// dest encoder and streams
	outFc.AudioCodec = Codec.CommonEncoders.AAC;
	MediaStream outAudioStream = outFc.NewStream(outFc.AudioCodec);
	using CodecContext audioEncoder = new(outFc.AudioCodec)
	{
		Channels = 1,
		SampleFormat = outFc.AudioCodec.Value.NegociateSampleFormat(AVSampleFormat.Fltp),
		SampleRate = outFc.AudioCodec.Value.NegociateSampleRates(48000),
		BitRate = 48000
	};
	audioEncoder.ChannelLayout = (ulong)ffmpeg.av_get_default_channel_layout(audioEncoder.Channels);
	audioEncoder.TimeBase = new AVRational(1, audioEncoder.SampleRate);
	audioEncoder.Open(outFc.AudioCodec);
	outAudioStream.Codecpar!.CopyFrom(audioEncoder);

	outFc.VideoCodec = Codec.FindEncoderByName("libx264");
	MediaStream outVideoStream = outFc.NewStream(outFc.VideoCodec);
	using VideoFilterContext vfilter = VideoFilterContext.Create(inVideoStream, "scale=1920:-1");
	using CodecContext videoEncoder = new(outFc.VideoCodec)
	{
		Flags = AV_CODEC_FLAG.GlobalHeader,
		ThreadCount = Environment.ProcessorCount, 
		ThreadType = ffmpeg.FF_THREAD_FRAME,
		BitRate = 595_000
	};
	vfilter.ConfigureEncoder(videoEncoder);
	var dict = new MediaDictionary
	{
		//["qp"] = "30",
		["tune"] = "zerolatency",
		["preset"] = "veryfast"
	};
	videoEncoder.Open(outFc.VideoCodec, dict);
	//dict.Dump();
	outVideoStream.Codecpar!.CopyFrom(videoEncoder);
	outVideoStream.TimeBase = videoEncoder.TimeBase;

	// begin write
	using IOContext io = IOContext.OpenWrite(destFile);
	outFc.Pb = io;
	outFc.WriteHeader();

	MediaThreadQueue<Frame> decodingQueue = inFc
		.ReadPackets(inVideoStream.Index, inAudioStream.Index)
		.DecodeAllPackets(inFc, audioDecoder, videoDecoder)
		.ToThreadQueue(cancellationToken: QueryCancelToken, boundedCapacity: 64);

	MediaThreadQueue<Packet> encodingQueue = decodingQueue.GetConsumingEnumerable()
		.ApplyVideoFilters(vfilter)
		.ConvertAllFrames(audioEncoder, videoEncoder)
		.AudioFifo(audioEncoder)
		.EncodeAllFrames(outFc, audioEncoder, videoEncoder)
		.ToThreadQueue(cancellationToken: QueryCancelToken);

	CancellationTokenSource end = new();
	QueryCancelToken.Register(() => end.Cancel());
	Dictionary<int, PtsDts> ptsDts = new();
	Task.Run(async () =>
	{
		double totalDuration = Math.Max(inVideoStream.GetDurationInSeconds(), inAudioStream.GetDurationInSeconds());
		try
		{
			while (!end.IsCancellationRequested)
			{
				Log();
				await Task.Delay(1000, end.Token);
			}
		}
		finally
		{
			Log();
		}

		void Log() => Console.WriteLine($"{GetStatusText()}, dec/enc queue: {decodingQueue.Count}/{encodingQueue.Count}");
		string GetStatusText() => $"{(outVideoStream.TimeBase * ptsDts.GetValueOrDefault(outVideoStream.Index, PtsDts.Default).Dts).ToDouble():F2} of {totalDuration:F2}";
	});
	encodingQueue.GetConsumingEnumerable()
		.RecordPtsDts(ptsDts)
		.WriteAll(outFc);
	end.Cancel();
	outFc.WriteTrailer();
}

The result (a 500+ MB file compressed down to 5 MB):

It's worth mentioning MediaThreadQueue<Frame> and MediaThreadQueue<Packet>: internally they are built on C#'s BlockingCollection plus extra threads, so decoding and encoding run as a pipeline, which helps throughput and keeps performance steady.
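
Conceptually, such a queue is just a bounded BlockingCollection fed by a background thread, roughly like the sketch below (ToBackgroundQueue is a hypothetical helper; the real MediaThreadQueue<T> in the library adds more plumbing for error propagation and disposal):

using System.Collections.Concurrent;
using System.Threading.Tasks;

// Producer/consumer decoupling: the upstream enumerable is drained on a background
// thread while the downstream consumes at its own pace; the bound limits memory use.
static BlockingCollection<T> ToBackgroundQueue<T>(IEnumerable<T> source, int boundedCapacity = 64)
{
    var queue = new BlockingCollection<T>(boundedCapacity);
    Task.Run(() =>
    {
        try
        {
            foreach (T item in source) queue.Add(item); // blocks when the queue is full
        }
        finally
        {
            queue.CompleteAdding(); // lets GetConsumingEnumerable() complete
        }
    });
    return queue;
}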

Example 3: creating a gif (meme stickers?)

Note: I built a demo website for this feature; click the "Generate" button and you get a sticker like this:

I uploaded the complete Visual Studio code sample to GitHub; you can get it here: https://github.com/sdcb/ffmpeg-wjz-sorry-generator

The steps and key points are:

  1. Decode the video
  2. Convert every frame to the BGRA pixel format
  3. Read and draw the subtitles with Direct2D
  4. Feed every frame through a video filter to convert it to PAL8
  5. Encode the PAL8 frames into a gif (a rough sketch of steps 4-5 follows below)

Note that this demo uses Direct2D, built on the open-source project Vortice.Windows.
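
As a very rough idea of steps 4 and 5, here is a sketch only, not taken from the repo: the filter string is the standard FFmpeg palette recipe, and the enum members AVCodecID.Gif / AVPixelFormat.Pal8 are my assumption of the simplified names; see the linked repository for the real wiring:

// Step 4: a palette filter graph turns the frames into PAL8
// (inVideoStream would be the decoded input video stream, as in example 2)
using VideoFilterContext palette = VideoFilterContext.Create(
    inVideoStream, "split[a][b];[a]palettegen[p];[b][p]paletteuse");

// Step 5: a gif encoder consumes PAL8 frames
using CodecContext gifEncoder = new CodecContext(Codec.FindEncoderById(AVCodecID.Gif))
{
    Width = 640,
    Height = 360,
    TimeBase = new AVRational(1, 10), // 10 fps gif
    PixelFormat = AVPixelFormat.Pal8,
};
gifEncoder.Open();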

Example 4: real desktop screen casting (remote desktop?)

This example streams one computer's screen to another computer in real time over the network, at a fairly low bandwidth cost. Use cases include real-time video calls, screen casting, and remote desktop control.

The code comes in two parts: the desktop capture-encode-send side and the remote receive-decode-display side.

Capture-encode-send side: full source

Required NuGet packages:

  • Sdcb.FFmpeg 4.4.3
  • Sdcb.FFmpeg.runtime.windows-x64 4.4.3
  • Sdcb.ScreenCapture

Full source (click to expand):

// This example was initially written based on Sdcb.FFmpeg 4.4.3 & Sdcb.ScreenCapture
void Main()
{
	StartService(QueryCancelToken);
}

void StartService(CancellationToken cancellationToken = default)
{
	var tcpListener = new TcpListener(IPAddress.Any, 5555);
	cancellationToken.Register(() => tcpListener.Stop());
	tcpListener.Start();

	while (!cancellationToken.IsCancellationRequested)
	{
		TcpClient client = tcpListener.AcceptTcpClient();
		Task.Run(() => ServeClient(client, cancellationToken));
	}
}

void ServeClient(TcpClient tcpClient, CancellationToken cancellationToken = default)
{
	try
	{
		using var _ = tcpClient;
		using NetworkStream stream = tcpClient.GetStream();
		using BinaryWriter writer = new(stream);
		RectI screenSize = ScreenCapture.GetScreenSize(screenId: 0);
		RdpCodecParameter rcp = new(AVCodecID.H264, screenSize.Width, screenSize.Height, AVPixelFormat.Bgr0);

		using CodecContext cc = new(Codec.CommonEncoders.Libx264RGB)
		{
			Width = rcp.Width,
			Height = rcp.Height,
			PixelFormat = rcp.PixelFormat,
			TimeBase = new AVRational(1, 20),
		};
		cc.Open(null, new MediaDictionary
		{
			["crf"] = "30",
			["tune"] = "zerolatency",
			["preset"] = "veryfast"
		});

		writer.Write(rcp.ToArray());
		using Frame source = new();
		foreach (Packet packet in ScreenCapture
			.CaptureScreenFrames(screenId: 0)
			.ToBgraFrame()
			.ConvertFrames(cc)
			.EncodeFrames(cc))
		{
			if (cancellationToken.IsCancellationRequested)
			{
				break;
			}
			writer.Write(packet.Data.Length);
			writer.Write(packet.Data.AsSpan());
		}
	}
	catch (IOException ex)
	{
		// Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host.
		// Unable to write data to the transport connection: An established connection was aborted by the software in your host machine.
		ex.Dump();
	}
}

public class Filo<T> : IDisposable
{
	private T? Item { get; set; }
	private ManualResetEventSlim Notify { get; } = new ManualResetEventSlim();

	public void Update(T item)
	{
		Item = item;
		Notify.Set();
	}

	public IEnumerable<T> Consume(CancellationToken cancellationToken = default)
	{
		while (!cancellationToken.IsCancellationRequested)
		{
			Notify.Wait(cancellationToken);
			yield return Item!;
		}
	}

	public void Dispose() => Notify.Dispose();
}

public static class BgraFrameExtensions
{
	public static IEnumerable<Frame> ToBgraFrame(this IEnumerable<LockedBgraFrame> bgras)
	{
		using Frame frame = new Frame();
		foreach (LockedBgraFrame bgra in bgras)
		{
			frame.Width = bgra.Width;
			frame.Height = bgra.Height;
			frame.Format = (int)AVPixelFormat.Bgra;
			frame.Data[0] = bgra.DataPointer;
			frame.Linesize[0] = bgra.RowPitch;
			yield return frame;
		}
	}
}

record RdpCodecParameter(AVCodecID CodecId, int Width, int Height, AVPixelFormat PixelFormat)
{
	public byte[] ToArray()
	{
		byte[] data = new byte[16];
		Span<byte> span = data.AsSpan();
		BinaryPrimitives.WriteInt32LittleEndian(span, (int)CodecId);
		BinaryPrimitives.WriteInt32LittleEndian(span[4..], Width);
		BinaryPrimitives.WriteInt32LittleEndian(span[8..], Height);
		BinaryPrimitives.WriteInt32LittleEndian(span[12..], (int)PixelFormat);
		return data;
	}
}

It's worth mentioning that the Sdcb.ScreenCapture NuGet package is also mine. It's based on DXGI (desktop duplication), achieves zero-copy capture, and can record the screen at 60 fps with very low CPU usage. I'll save a proper introduction of that project for a future post; its GitHub address is: https://github.com/sdcb/Sdcb.ScreenCapture

Receive-decode-display side: full source

Required NuGet packages:

  • Sdcb.FFmpeg 4.4.3
  • Sdcb.FFmpeg.runtime.windows-x64 4.4.3
  • FlysEngine.Desktop

Full source (click to expand):

// This example was initially written based on Sdcb.FFmpeg 4.4.3 & FlysEngine.Desktop
#nullable enable

ManagedBgraFrame? managedFrame = null;
bool cancel = false;

unsafe void Main()
{
	using RenderWindow w = new();
	w.FormClosed += delegate { cancel = true; };
	Task decodingTask = Task.Run(() => DecodeThread(() => (3840, 2160)));

	w.Draw += (_, ctx) =>
	{
		ctx.Clear(Colors.CornflowerBlue);
		if (managedFrame == null) return;

		ManagedBgraFrame frame = managedFrame.Value;

		fixed (byte* ptr = frame.Data)
		{
			//new System.Drawing.Bitmap(frame.Width, frame.Height, frame.RowPitch, System.Drawing.Imaging.PixelFormat.Format32bppPArgb, (IntPtr)ptr).DumpUnscaled();
			BitmapProperties1 props = new(new PixelFormat(Format.B8G8R8A8_UNorm, Vortice.DCommon.AlphaMode.Premultiplied));
			using ID2D1Bitmap bmp = ctx.CreateBitmap(new SizeI(frame.Width, frame.Height), (IntPtr)ptr, frame.RowPitch, props);
			ctx.UnitMode = UnitMode.Dips;
			ctx.DrawBitmap(bmp, 1.0f, InterpolationMode.NearestNeighbor);
		}
	};
	RenderLoop.Run(w, () => w.Render(1, Vortice.DXGI.PresentFlags.None));
}

async Task DecodeThread(Func<(int width, int height)> sizeAccessor)
{
	using TcpClient client = new TcpClient();
	await client.ConnectAsync(IPAddress.Loopback, 5555);
	using NetworkStream stream = client.GetStream();

	using BinaryReader reader = new(stream);
	RdpCodecParameter rcp = RdpCodecParameter.FromSpan(reader.ReadBytes(16));

	using CodecContext cc = new(Codec.FindDecoderById(rcp.CodecId))
	{
		Width = rcp.Width,
		Height = rcp.Height,
		PixelFormat = rcp.PixelFormat,
	};
	cc.Open(null);

	foreach (var frame in reader
		.ReadPackets()
		.DecodePackets(cc)
		.ConvertVideoFrames(sizeAccessor, AVPixelFormat.Bgra)
		.ToManaged()
		)
	{
		if (cancel) break;
		managedFrame = frame;
	}
}


public static class FramesExtensions
{
	public static IEnumerable<ManagedBgraFrame> ToManaged(this IEnumerable<Frame> bgraFrames, bool unref = true)
	{
		foreach (Frame frame in bgraFrames)
		{
			int rowPitch = frame.Linesize[0];
			int length = rowPitch * frame.Height;
			byte[] buffer = new byte[length];
			Marshal.Copy(frame.Data._0, buffer, 0, length);
			ManagedBgraFrame managed = new(buffer, length, length / frame.Height);
			if (unref) frame.Unref();
			yield return managed;
		}
	}
}

public record struct ManagedBgraFrame(byte[] Data, int Length, int RowPitch)
{
	public int Width => RowPitch / BytePerPixel;
	public int Height => Length / RowPitch;

	public const int BytePerPixel = 4;
}


public static class ReadPacketExtensions
{
	public static IEnumerable<Packet> ReadPackets(this BinaryReader reader)
	{
		using Packet packet = new();
		while (true)
		{
			int packetSize = reader.ReadInt32();
			if (packetSize == 0) yield break;

			byte[] data = reader.ReadBytes(packetSize);
			GCHandle dataHandle = GCHandle.Alloc(data, GCHandleType.Pinned);
			try
			{
				packet.Data = new DataPointer(dataHandle.AddrOfPinnedObject(), packetSize);
				yield return packet;
			}
			finally
			{
				dataHandle.Free();
			}
		}
	}
}

record RdpCodecParameter(AVCodecID CodecId, int Width, int Height, AVPixelFormat PixelFormat)
{
	public static RdpCodecParameter FromSpan(ReadOnlySpan<byte> data)
	{
		return new RdpCodecParameter(
			CodecId: (AVCodecID)BinaryPrimitives.ReadInt32LittleEndian(data),
			Width: BinaryPrimitives.ReadInt32LittleEndian(data[4..]),
			Height: BinaryPrimitives.ReadInt32LittleEndian(data[8..]),
			PixelFormat: (AVPixelFormat)BinaryPrimitives.ReadInt32LittleEndian(data[12..]));
	}
}

Running both together looks like this:

The end-to-end latency is around 0.28 seconds. That is my 4K monitor, encoded with libx264 and transmitted as yuv420p, which is already good enough for real scenarios such as screen sharing in online meetings, live casting, and remote control (at 1080p the latency should be even lower).

Note that this source code uses FlysEngine, my own open-source Direct2D wrapper engine. You don't need to care about its internals (just install the NuGet package), but if you happen to be curious, that's another topic I'll leave for a future post; for now it's enough to know that it is a thin wrapper over D3D11, DXGI, Direct2D, WIC, and DirectWrite.

Example 5: receiving and displaying an RTSP camera stream

This program depends on the following NuGet packages:

  • FlysEngine.Desktop
  • Sdcb.FFmpeg 4.4.3
  • Sdcb.FFmpeg.runtime.windows-x64 4.4.3

Full code (click to expand):

#nullable enable

FFmpegBmp? ffBmp = null;
FFmpegBmp? lastFFbmp = null;
FFmpegLogger.LogWriter = (level, msg) => Console.Write(msg);
CancellationTokenSource cts = new();

using RenderWindow w = new();
Task.Run(() => DecodeRTSP(Util.GetPassword("home-rtsp-ipc"), cts.Token));
w.Draw += (_, ctx) =>
{
	if (ffBmp == null) return;
	if (lastFFbmp == ffBmp) return;

	GCHandle handle = GCHandle.Alloc(ffBmp.Data, GCHandleType.Pinned);
	try
	{
		using ID2D1Bitmap bmp = ctx.CreateBitmap(new SizeI(ffBmp.Width, ffBmp.Height), handle.AddrOfPinnedObject(), ffBmp.RowPitch, new BitmapProperties(new Vortice.DCommon.PixelFormat(Format.B8G8R8A8_UNorm, Vortice.DCommon.AlphaMode.Premultiplied)));
		lastFFbmp = ffBmp;
		Size clientSize = ctx.Size;
		float top = (clientSize.Height - ffBmp.Height) / 2;
		ctx.Transform = Matrix3x2.CreateTranslation(0, top);
		ctx.DrawBitmap(bmp, 1.0f, InterpolationMode.Linear);
	}
	finally
	{
		handle.Free();
	}
};
w.FormClosing += delegate { cts.Cancel(); };
RenderLoop.Run(w, () => w.Render(1, Vortice.DXGI.PresentFlags.None));

void DecodeRTSP(string url, CancellationToken cancellationToken = default)
{
	using FormatContext fc = FormatContext.OpenInputUrl(url);
	fc.LoadStreamInfo();
	MediaStream videoStream = fc.GetVideoStream();

	using CodecContext videoDecoder = new CodecContext(Codec.FindDecoderByName("hevc_qsv"));
	videoDecoder.FillParameters(videoStream.Codecpar!);
	videoDecoder.Open();

	foreach (Frame frame in fc
		.ReadPackets(videoStream.Index)
		.DecodePackets(videoDecoder)
		.ConvertVideoFrames(() => new(w.ClientSize.Width, w.ClientSize.Width * videoDecoder.Height / videoDecoder.Width), AVPixelFormat.Bgr0))
	{
		if (cancellationToken.IsCancellationRequested) break;

		try
		{
			byte[] data = new byte[frame.Linesize[0] * frame.Height];
			Marshal.Copy(frame.Data._0, data, 0, data.Length);
			ffBmp = new FFmpegBmp(frame.Width, frame.Height, frame.Linesize[0], data);
		}
		finally
		{
			frame.Unref();
		}
	}
}

public record FFmpegBmp(int Width, int Height, int RowPitch, byte[] Data);

The camera at my family home in the countryside is an RTSP camera; this is the code above running against it:

Example 6: reading an RTSP stream and saving it to mp4/mov files

This example depends on the following NuGet packages:

  • Sdcb.FFmpeg 4.4.3
  • Sdcb.FFmpeg.runtime.windows-x64 4.4.3

Full code sample (click to expand):

// The example was initially written using Sdcb.FFmpeg 4.4.3
FFmpegLogger.LogWriter = (level, msg) => Console.Write(msg);

using FormatContext inFc = FormatContext.OpenInputUrl(Util.GetPassword("home-rtsp-ipc"));
inFc.LoadStreamInfo();
MediaStream inAudioStream = inFc.GetAudioStream();
MediaStream inVideoStream = inFc.GetVideoStream();
long gpts_v = 0, gpts_a = 0, gdts_v = 0, gdts_a = 0;

while (!QueryCancelToken.IsCancellationRequested)
{
	using FormatContext outFc = FormatContext.AllocOutput(formatName: "mov");
	string dir = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "rtsp", DateTime.Now.ToString("yyyy-MM-dd"));
	Directory.CreateDirectory(dir);
	using IOContext io = IOContext.OpenWrite(Path.Combine(dir, $"{DateTime.Now:HHmmss}.mov"));
	outFc.Pb = io;

	MediaStream videoStream = outFc.NewStream(Codec.FindEncoderById(inVideoStream.Codecpar!.CodecId));
	videoStream.Codecpar!.CopyFrom(inVideoStream.Codecpar);
	videoStream.TimeBase = inVideoStream.RFrameRate.Inverse();
	videoStream.SampleAspectRatio = inVideoStream.SampleAspectRatio;

	MediaStream audioStream = outFc.NewStream(Codec.FindEncoderById(inAudioStream.Codecpar!.CodecId));
	audioStream.Codecpar!.CopyFrom(inAudioStream.Codecpar);
	audioStream.TimeBase = inAudioStream.TimeBase;
	audioStream.Codecpar.ChannelLayout = (ulong)ffmpeg.av_get_default_channel_layout(inAudioStream.Codecpar.Channels);

	outFc.WriteHeader();
	
	FilterPackets(inFc.ReadPackets(inAudioStream.Index, inVideoStream.Index), videoFrameCount: 60 * 20)
		.WriteAll(outFc);
	outFc.WriteTrailer();

	IEnumerable<Packet> FilterPackets(IEnumerable<Packet> packets, int videoFrameCount)
	{
		long pts_v = gpts_v, pts_a = gpts_a, dts_v = gdts_v, dts_a = gdts_a;
		long[] buffer = new long[200];
		long ithreshold = -1;
		int videoFrame = 0;

		foreach (Packet pkt in packets)
		{
			pkt.StreamIndex = pkt.StreamIndex == inAudioStream.Index ?
					audioStream.Index :
					videoStream.Index;
			if (pkt.StreamIndex == inAudioStream.Index)
			{
				// audio
				(gpts_a, gdts_a, pkt.Pts, pkt.Dts) = (pkt.Pts, pkt.Dts, pkt.Pts - pts_a, pkt.Dts - dts_a);
				pkt.RescaleTimestamp(inAudioStream.TimeBase, audioStream.TimeBase);
			}
			else
			{
				// video
				if (videoFrame < buffer.Length)
				{
					buffer[videoFrame] = pkt.Data.Length;
					ithreshold = -1;
				}
				else if (videoFrame == buffer.Length)
				{
					ithreshold = buffer.Order().ToArray()[buffer.Length / 2] * 4;
				}
				
				if (videoFrame >= videoFrameCount && pkt.Data.Length > ithreshold)
				{
					break;
				}

				(gpts_v, gdts_v, pkt.Pts, pkt.Dts) = (pkt.Pts, pkt.Dts, pkt.Pts - pts_v, pkt.Dts - dts_v);
				pkt.RescaleTimestamp(inVideoStream.TimeBase, videoStream.TimeBase);
				videoFrame++;
			}
			yield return pkt;
		}
	}
}

This program can run around the clock. The complete video and audio recorded from the RTSP camera, split into one file roughly every 1.5 minutes, is saved into this folder on the desktop (as shown):

With that, it might even be able to replace a dedicated network video recorder~

Summary and outlook

I believe there is a difference between getting something to work and getting it done well. In the C# world this kind of thing used to be merely "usable", which is fundamentally different from how enthusiasts in the node.js or Python communities polish their tooling. I hope this open-source project is a step toward making .NET a first-class citizen in this space.

Maintaining open source is not easy. If you like the project, please give it a thumbs-up and a star: https://github.com/sdcb/Sdcb.FFmpeg

I'd also like to set a goal for myself: in the future I hope to wrap FlyCV, libyuv, x264, and libaom-based AV1, and maybe one day even build a .NET edition of FFmpeg.

If you like this, please follow my WeChat official account: 【DotNet騷操作】