Internationalizing .NET Applications: Extracting Terms and Updating Code with Roslyn

2023-03-19 12:01:04

The previous article introduced "Internationalizing VUE+.NET Applications: The Multilingual Term Service".

The overall design of the internationalization retrofit is as follows:

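  1. Provide a tool that detects Chinese text in front-end and back-end code, turns it into multilingual terms, and manages all terms uniformly by language, screen, and module.
  2. Provide a translation service that translates the multilingual terms in batches.
  3. Provide a term service that lets back-end code fetch, at run time, the text that matches the logged-in user's language.
  4. Provide a front-end multilingual JS generation service that generates a multilingual JS file per screen, for use in the front-end VUE files.
  5. Provide a code replacement tool that replaces Chinese text in VUE front-end code with $t("term ID") and Chinese text in back-end code with TermService.Current.GetText("term ID").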

Today we build on the previous article and describe how to extract terms and update code with Roslyn.

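I. Business Background

First, some background. The back-end .NET code contains a large number of Chinese prompts and exception messages, and even some Chinese return-value strings.

All of this Chinese text has to be identified and extracted into multilingual terms, and the code has to be rewritten to call the multilingual term service for the translated text.

For example: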

private static void CheckMd5(string fileName, string md5Data)
{
      string md5Str = MD5Service.GetMD5(fileName);
      if (!string.Equals(md5Str, md5Data, StringComparison.OrdinalIgnoreCase))
      {
           throw new CustomException(PackageExceptionConst.FileMd5CheckFailed, "服務包檔案MD5校驗失敗:" + fileName);
      }
}

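In this code, the text "服務包檔案MD5校驗失敗" needs to be made multilingual.

We do this by calling the multilingual term service, I18NTermService, which returns the translated text for the language set in the thread context. For example: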

var text = T.Core.I18N.Service.TermService.Current.GetTextFormatted("term ID", "default text");

throw new CustomException(PackageExceptionConst.FileMd5CheckFailed, text + fileName);
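To make the lookup above more concrete, here is a minimal sketch of what such a service could look like. `SimpleTermService`, its in-memory dictionary, and the use of `CultureInfo.CurrentUICulture` as the "language in the thread context" are assumptions made for illustration only; the real T.Core.I18N.Service.TermService is not shown in this article.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical stand-in for the term service; NOT the real T.Core.I18N implementation.
public sealed class SimpleTermService
{
    // Key: (language name, term ID) -> translated text.
    private readonly Dictionary<(string Lang, string TermId), string> _terms =
        new Dictionary<(string Lang, string TermId), string>();

    public void Add(string lang, string termId, string text) => _terms[(lang, termId)] = text;

    // Looks up the term for the current thread's UI culture and
    // falls back to the default text when no translation exists.
    public string GetText(string termId, string defaultText)
    {
        var lang = CultureInfo.CurrentUICulture.Name;
        return _terms.TryGetValue((lang, termId), out var text) ? text : defaultText;
    }

    // Same lookup, then fills the {0}, {1}, ... placeholders with args.
    public string GetTextFormatted(string termId, string defaultText, params object[] args)
    {
        return string.Format(CultureInfo.CurrentCulture, GetText(termId, defaultText), args);
    }
}
```

Falling back to the default text means that a term that has not been translated yet still produces the original Chinese message instead of an empty string.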

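Against this background, we use Roslyn to scan the code for Chinese text, then extract terms from the scanned text and replace the code.

II. Scanning Code for Chinese Text with Roslyn

First, we define TermScanResult, the class that holds one scanned multilingual term: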

[Serializable]
public class TermScanResult
{
    public Guid Id { get; set; }
    public string OriginalText { get; set; }

    public string ChineseText { get; set; }

    public string SlnName { get; set; }

    public string ProjectName { get; set; }

    public string ClassFile { get; set; }

    public string MethodName { get; set; }

    public string Code { get; set; }

    public I18NTerm I18NTerm { get; set; }

    public string SlnPath { get; set; }

    public string ClassPath { get; set; }

    public string SubSystemCode { get; set; }

    public override string ToString()
    {
        return Code;
    }
}

SubSystemCode in the code above is a business management dimension; you can ignore it.

We scan for Chinese text one .sln solution at a time.

The implementation is as follows:

public async Task<List<TermScanResult>> CheckSln(string slnPath, System.ComponentModel.BackgroundWorker backgroundWorker, SubSystemFile subSystemFiles, string subSystem)
{
            var slnFile = new FileInfo(slnPath);
            var results = new List<TermScanResult>();

            MSBuildHelper.RegisterMSBuilder();
            var solution = await MSBuildWorkspace.Create().OpenSolutionAsync(slnPath);

            var subSystemInfo = subSystemFiles?.SubSystemSlnMappings.FirstOrDefault(w => w.SlnName.Select(s => s += ".sln").Contains(slnFile.Name.ToLower()));

            if (solution.Projects != null && solution.Projects.Count() > 0)
            {
                foreach (var project in solution.Projects.ToList())
                {
                    backgroundWorker.ReportProgress(10, $"掃描Project: {project.Name}");
                    var documents = project.Documents.Where(x => x.Name.Contains(".cs"));

                    if (project.Name.ToLower().Contains("test"))
                    {
                        continue;
                    }
                    var codeReplace = new CodeReplace();
                    foreach (var document in documents)
                    {
                        var tree = await document.GetSyntaxTreeAsync();
                        var root = tree.GetCompilationUnitRoot();
                        if (root.Members == null || root.Members.Count == 0) continue;
                        //member
                        var classDeclartions = root.DescendantNodes().Where(i => i is ClassDeclarationSyntax);

                        foreach (var classDeclare in classDeclartions)
                        {
                            var programDeclaration = classDeclare as ClassDeclarationSyntax;
                            if (programDeclaration == null) continue;

                            foreach (var memberDeclarationSyntax in programDeclaration.Members)
                            {
                                foreach (var item in GetLiteralStringExpression(memberDeclarationSyntax))
                                {
                                    var statementCode = item.Item1;
                                    foreach (var syntaxNode in item.Item3)
                                    {
                                        ExpressionSyntaxParser expressionSyntaxParser = new ExpressionSyntaxParser();
                                        var text = "";
                                        var expressionSyntax = expressionSyntaxParser
                                            .GetExpressionSyntaxVerifyRule(syntaxNode as ExpressionSyntax, statementCode);
                                        if (expressionSyntax != null)
                                        {
                                            // Skip callers that are on the exclusion list
                                            if (expressionSyntaxParser.IsExcludeCaller(expressionSyntax, statementCode))
                                            {
                                                continue;
                                            }

                                            text = expressionSyntaxParser.GetExpressionSyntaxOriginalText(expressionSyntax, statementCode);
                                            if (expressionSyntax is Microsoft.CodeAnalysis.CSharp.Syntax.InterpolatedStringExpressionSyntax)
                                            {
                                                text = expressionSyntaxParser.GetExpressionSyntaxOriginalText(expressionSyntax, statementCode);

                                                if (expressionSyntax is Microsoft.CodeAnalysis.CSharp.Syntax.LiteralExpressionSyntax)
                                                {
                                                    if (!expressionSyntax.IsKind(SyntaxKind.StringLiteralExpression))
                                                    {
                                                        continue;
                                                    }
                                                    text = expressionSyntax.NormalizeWhitespace().ToString();
                                                }
                                            }
                                        }
                                        if (CheckChinese(text) == false) continue;
                                        if (string.IsNullOrWhiteSpace(text)) continue;
                                        if (string.IsNullOrWhiteSpace(text.Replace("\"", "").Trim())) continue;

                                        results.Add(new TermScanResult()
                                        {
                                            Id = Guid.NewGuid(),
                                            ClassPath = programDeclaration.SyntaxTree.FilePath,
                                            SlnPath = slnPath,
                                            OriginalText = text.Replace("\"", "").Trim(),
                                            ChineseText = text,
                                            SlnName = slnFile.Name,
                                            ProjectName = project.Name,
                                            ClassFile = programDeclaration.Identifier.Text,
                                            MethodName = item.Item2,
                                            Code = statementCode,
                                            SubSystemCode = subSystem
                                        });
                                    }
                                }
                            }
                        }
                    }
                }
            }

     return results;
}

In the code above, we first register MSBuild and then load the .sln solution through an MSBuildWorkspace:

MSBuildHelper.RegisterMSBuilder();
var solution = await MSBuildWorkspace.Create().OpenSolutionAsync(slnPath);
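MSBuildHelper.RegisterMSBuilder() is not shown in this article. A plausible sketch, assuming it wraps the Microsoft.Build.Locator package, is given below. `SolutionLoader`, `IsTestProject`, and `ListProjectNamesAsync` are hypothetical names; the block needs the Microsoft.Build.Locator and Microsoft.CodeAnalysis.Workspaces.MSBuild NuGet packages.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Build.Locator;
using Microsoft.CodeAnalysis.MSBuild;

public static class SolutionLoader
{
    // Assumed equivalent of MSBuildHelper.RegisterMSBuilder():
    // locate an installed MSBuild once per process.
    public static void EnsureMSBuildRegistered()
    {
        if (!MSBuildLocator.IsRegistered)
        {
            MSBuildLocator.RegisterDefaults();
        }
    }

    // Mirrors the article's rule of skipping projects whose name contains "test".
    public static bool IsTestProject(string projectName) =>
        projectName.IndexOf("test", StringComparison.OrdinalIgnoreCase) >= 0;

    // Loads the solution and returns the names of the projects that would be scanned.
    public static async Task<List<string>> ListProjectNamesAsync(string slnPath)
    {
        EnsureMSBuildRegistered();
        using var workspace = MSBuildWorkspace.Create();
        var solution = await workspace.OpenSolutionAsync(slnPath);

        // Projects that fail to load are reported here instead of throwing.
        foreach (var diagnostic in workspace.Diagnostics)
        {
            Console.Error.WriteLine($"{diagnostic.Kind}: {diagnostic.Message}");
        }

        return solution.Projects
            .Where(p => !IsTestProject(p.Name))
            .Select(p => p.Name)
            .ToList();
    }
}
```

Checking workspace.Diagnostics is worth the extra lines: when a project cannot be loaded, OpenSolutionAsync usually still returns a solution, only with fewer projects or documents than expected.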

Next, we iterate over the classes in each Project of the solution:

foreach (var project in solution.Projects.ToList())
var documents = project.Documents.Where(x => x.Name.Contains(".cs"));

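Then we walk every statement in the declarations, members, and methods of each class, and use a regular expression to check whether it contains Chinese characters: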

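// Compiled once and reused; matches characters in the range U+4E00 to U+9FA5.
private static readonly Regex ChineseRegex = new Regex(@"[\u4e00-\u9fa5]");

public static bool CheckChinese(string strZh)
{
            // Null or empty text cannot contain Chinese, and Regex.IsMatch throws on null.
            if (string.IsNullOrEmpty(strZh))
            {
                return false;
            }
            return ChineseRegex.IsMatch(strZh);
}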
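The helpers GetLiteralStringExpression and ExpressionSyntaxParser are not listed in this article. As a rough illustration of the idea, the sketch below parses a piece of C# source with Roslyn and returns every string literal or interpolated string that contains Chinese characters. `ChineseLiteralScanner` is a hypothetical name, and the block needs the Microsoft.CodeAnalysis.CSharp NuGet package.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class ChineseLiteralScanner
{
    private static readonly Regex Chinese = new Regex(@"[\u4e00-\u9fa5]");

    // Returns the source text of each string literal or interpolated string
    // that contains at least one Chinese character, in document order.
    public static List<string> Scan(string sourceCode)
    {
        var root = CSharpSyntaxTree.ParseText(sourceCode).GetCompilationUnitRoot();
        var hits = new List<string>();

        foreach (var node in root.DescendantNodes())
        {
            bool isString =
                (node is LiteralExpressionSyntax literal
                    && literal.IsKind(SyntaxKind.StringLiteralExpression))
                || node is InterpolatedStringExpressionSyntax;

            if (isString && Chinese.IsMatch(node.ToString()))
            {
                hits.Add(node.ToString());
            }
        }

        return hits;
    }
}
```

Working on syntax nodes rather than raw lines means that Chinese text in comments is never reported, because comments are trivia and not part of any expression node.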

If Chinese characters are present, the text is added to the scan results and treated as a multilingual term:

results.Add(new TermScanResult()
{
        Id = Guid.NewGuid(),
        ClassPath = programDeclaration.SyntaxTree.FilePath,
        SlnPath = slnPath,
        OriginalText = text.Replace("\"", "").Trim(),
        ChineseText = text,
        SlnName = slnFile.Name,
        ProjectName = project.Name,
        ClassFile = programDeclaration.Identifier.Text,
        MethodName = item.Item2,
        Code = statementCode,        // management dimension
        SubSystemCode = subSystem    // management dimension
});

Note that the scan does not assign the term property of TermScanResult:

public I18NTerm I18NTerm { get; set; }

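In the code of the next article, we will call the multilingual translation service and store the translated text in the I18NTerm property as the multilingual term.

III. Code Replacement

For code replacement we designed a class, SourceWeaver, which rewrites the code based on the scan results from the previous step.

Its CodeScanReplace method performs a second scan of the code and then the replacement: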
 /// <summary>
    /// Source code replacement service
    /// </summary>
    public class SourceWeaver
    {
        List<CommonTermDto> commonTerms = new List<CommonTermDto>();
        List<CommonTermDto> commSubTerms = new List<CommonTermDto>();

        public SourceWeaver()
        {
            commonTerms = JsonConvert.DeserializeObject<List<CommonTermDto>>(File.ReadAllText("comm_data.json"));
            commSubTerms = JsonConvert.DeserializeObject<List<CommonTermDto>>(File.ReadAllText("comm_sub_data.json"));
        }
        public async Task CodeScanReplace(Tuple<List<I18NTerm>, List<TermScanResult>> result, System.ComponentModel.BackgroundWorker backgroundWorker)
        {
            try
            {
                backgroundWorker.ReportProgress(0, "正在對程式碼進行替換.");
                var termScanResultGroupBy = result.Item2.GroupBy(g => g.SlnName);
                foreach (var termScanResult in termScanResultGroupBy)
                {
                    var termScan = termScanResult.FirstOrDefault();
                    MSBuildHelper.RegisterMSBuilder();
                    var solution = await MSBuildWorkspace.Create().OpenSolutionAsync(termScan.SlnPath).ConfigureAwait(false);
                    if (solution.Projects.Any())
                    {
                        foreach (var project in solution.Projects.ToList())
                        {
                            if (project.Name.ToLower().Contains("test"))
                            {
                                continue;
                            }
                            var projectTermScanResults = result.Item2.Where(f => f.ProjectName == project.Name);

                            var documents = project.Documents.Where(x =>
                            {
                                return x.Name.Contains(".cs") && projectTermScanResults.Any(f => $"{f.ClassPath}" == x.FilePath);
                            });

                            foreach (var document in documents)
                            {
                                var tree = await document.GetSyntaxTreeAsync().ConfigureAwait(false);
                                var root = tree.GetCompilationUnitRoot();
                                if (root.Members.Count == 0) continue;

                                var classDeclartions = root.DescendantNodes()
                                    .Where(i => i is ClassDeclarationSyntax);
                                List<MemberDeclarationSyntax> syntaxNodes = new List<MemberDeclarationSyntax>();
                                foreach (var classDeclare in classDeclartions)
                                {
                                    if (!(classDeclare is ClassDeclarationSyntax programDeclaration)) continue;
                                    var className = programDeclaration.Identifier.Text;

                                    foreach (var method in programDeclaration.Members)
                                    {
                                        if (method is ConstructorDeclarationSyntax)
                                        {
                                            syntaxNodes.Add((ConstructorDeclarationSyntax)method);
                                        }
                                        else if (method is MethodDeclarationSyntax)
                                        {
                                            syntaxNodes.Add((MethodDeclarationSyntax)method);
                                        }
                                        else if (method is PropertyDeclarationSyntax)
                                        {
                                            syntaxNodes.Add(method);
                                        }
                                        else if (method is FieldDeclarationSyntax)
                                        {
                                            // Note: constants are not supported
                                            syntaxNodes.Add(method);
                                        }
                                    }
                                }

                                var terms = termScanResult.Where(
                                    f => f.ProjectName == document.Project.Name && f.ClassPath == document.FilePath).ToList();
                                backgroundWorker.ReportProgress(10, $"正在檢查{document.FilePath}檔案.");
                                // Await the replacement so that failures reach the catch block below.
                                await ReplaceNodesAndSave(root, syntaxNodes, terms, result, backgroundWorker, document.Name).ConfigureAwait(false);
                            }
                        }
                    }
                }
            }
            catch (Exception ex)
            {
                LogUtils.LogError(string.Format("異常型別:{0}\r\n異常訊息:{1}\r\n異常資訊:{2}\r\n",
                    ex.GetType().Name, ex.Message, ex.StackTrace));
                backgroundWorker.ReportProgress(0, ex.Message);
            }
        }

        // Returns Task instead of void so that callers can await it and observe exceptions.
        public async Task ReplaceNodesAndSave(SyntaxNode classSyntaxNode, List<MemberDeclarationSyntax> syntaxNodes, IEnumerable<TermScanResult> terms, Tuple<List<I18NTerm>, List<TermScanResult>> result,
            System.ComponentModel.BackgroundWorker backgroundWorker, string className)
        {

            { // Check whether the terms already exist in the term store ("pro")
                if (AppConfig.Instance.IsCheckTermPro)
                {
                    backgroundWorker.ReportProgress(15, $"詞條驗證中.");
                    var termsCodes = terms.Select(f => f.I18NTerm.Code).ToList();
                    var size = 100;
                    // Number of pages, based on the codes that are actually queried.
                    var p = (termsCodes.Count + size - 1) / size;

                    using DBHelper dBHelper = new DBHelper();
                    List<I18NTerm> items = new List<I18NTerm>();
                    for (int i = 0; i < p; i++)
                    {
                        // Query only the current page of term codes, not the whole list.
                        var list = termsCodes
                            .Skip(i * size).Take(size).ToList();
                        await Task.Delay(10).ConfigureAwait(false);
                        var segmentItems = await dBHelper.GetTermsAsync(list).ConfigureAwait(false);
                        items.AddRange(segmentItems);
                    }

                    List<TermScanResult> termScans = new List<TermScanResult>();
                    foreach (var term in terms)
                    {
                        if (items.Any(f => f.Code == term.I18NTerm.Code))
                        {
                            termScans.Add(term);
                        }
                        else
                        {
                            backgroundWorker.ReportProgress(20, $"詞條{term.OriginalText}未匯入到詞條庫,該詞條將忽略替換.");
                        }
                    }
                    terms = termScans;
                }
            }

            var newclassDeclare = classSyntaxNode;
            newclassDeclare = classSyntaxNode.ReplaceNodes(syntaxNodes,
                    (methodDeclaration, _) =>
                    {                     
                        MemberDeclarationSyntax newMemberDeclarationSyntax = methodDeclaration;
                        var className = ((ClassDeclarationSyntax)newMemberDeclarationSyntax.Parent).Identifier.Text;
                        List<StatementSyntax> statementSyntaxes = new List<StatementSyntax>();

                        switch (newMemberDeclarationSyntax)
                        {
                            case ConstructorDeclarationSyntax:
                                {
                                    var blockSyntax = (newMemberDeclarationSyntax as ConstructorDeclarationSyntax).NormalizeWhitespace().Body;
                                    if (blockSyntax == null)
                                    {
                                        break;
                                    }
                                    foreach (var statement in blockSyntax.Statements)
                                    {
                                        var nodeStatement = statement.DescendantNodes();

                                        statementSyntaxes.Add(new CodeReplace().ReplaceStatementNodes(statement,
                                            new ExpressionSyntaxParser().LiteralStringExpression(nodeStatement), terms, commonTerms, commSubTerms));
                                    }

                                    break;
                                }

                            case MethodDeclarationSyntax:
                                {
                                    var blockSyntax = (methodDeclaration as MethodDeclarationSyntax).NormalizeWhitespace().Body;
                                    if (blockSyntax == null)
                                    {
                                        break;
                                    }
                                    foreach (var statement in blockSyntax.Statements)
                                    {
                                        var nodeStatement = statement.DescendantNodes();
                                        statementSyntaxes.Add(new CodeReplace().ReplaceStatementNodes(statement,
                                               new ExpressionSyntaxParser().LiteralStringExpression(nodeStatement), terms, commonTerms, commSubTerms));
                                    }

                                    break;
                                }

                            case PropertyDeclarationSyntax:
                                {
                                    var propertyDeclarationSyntax = newMemberDeclarationSyntax as PropertyDeclarationSyntax;

                                    var nodeStatement = propertyDeclarationSyntax.DescendantNodes();

                                    return new CodeReplace().ReplacePropertyNodes(newMemberDeclarationSyntax as PropertyDeclarationSyntax,
                                        new ExpressionSyntaxParser().LiteralStringExpression(nodeStatement), terms, commonTerms, commSubTerms);
                                }

                            case FieldDeclarationSyntax:
                                {
                                    var fieldDeclarationSyntax = newMemberDeclarationSyntax as FieldDeclarationSyntax;
                                    var nodeStatement = fieldDeclarationSyntax.DescendantNodes();
                                    return new CodeReplace().ReplaceFiledNodes(fieldDeclarationSyntax,
                                           new ExpressionSyntaxParser().LiteralStringExpression(nodeStatement), terms, commonTerms, commSubTerms);
                                }
                        }
                        backgroundWorker.ReportProgress(50, $"解析並對類檔案{className}中的方法做語句替換.");
                        // Replace the statements inside the method or constructor body
                        if (newMemberDeclarationSyntax is MethodDeclarationSyntax)
                        {
                            return new CodeReplace().ReplaceMethodDeclaration(newMemberDeclarationSyntax as MethodDeclarationSyntax, statementSyntaxes);
                        }
                        else if (newMemberDeclarationSyntax is ConstructorDeclarationSyntax)
                        {
                            return new CodeReplace().ReplaceConstructorDeclaration(newMemberDeclarationSyntax as ConstructorDeclarationSyntax, statementSyntaxes);
                        }
                        return newMemberDeclarationSyntax;
                    });

            var sourceStr = newclassDeclare.NormalizeWhitespace().GetText().ToString();
            // Take the path from the original tree, which is known to carry the file path.
            File.WriteAllText(classSyntaxNode.SyntaxTree.FilePath, sourceStr);
            backgroundWorker.ReportProgress(100, $"完成{className}的替換.");
        }
    }
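The term check in ReplaceNodesAndSave queries the term store in pages of 100 codes. The paging step can be isolated as in the sketch below; `Paging.Split` is a hypothetical helper and not part of the original tool.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // Splits items into consecutive pages of at most pageSize elements.
    public static List<List<T>> Split<T>(IReadOnlyList<T> items, int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));

        // Ceiling division: 250 items with a page size of 100 give 3 pages.
        var pageCount = (items.Count + pageSize - 1) / pageSize;
        var pages = new List<List<T>>(pageCount);

        for (var i = 0; i < pageCount; i++)
        {
            pages.Add(items.Skip(i * pageSize).Take(pageSize).ToList());
        }

        return pages;
    }
}
```

Each page can then be passed to the database query on its own, which keeps the size of every request bounded.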

The key piece, the semantic replacement of the code, is implemented as follows:

 public StatementSyntax ReplaceStatementNodes(StatementSyntax statement, List<ExpressionSyntax> expressionSyntaxes, IEnumerable<TermScanResult> terms
            , List<CommonTermDto> commonTerms, List<CommonTermDto> commSubTerms)
        {
            var statementSyntax = statement.ReplaceNodes(expressionSyntaxes, (syntaxNode, _) =>
            {
                var statementStr = statement.NormalizeWhitespace().ToString();

                var argumentLists = statement.DescendantNodes().
                                               OfType<InvocationExpressionSyntax>();
                ExpressionSyntaxParser expressionSyntaxParser = new ExpressionSyntaxParser();
                return expressionSyntaxParser.ExpressionSyntaxTermReplace(syntaxNode, statementStr, terms, commonTerms, commSubTerms);

            });

            return statementSyntax;
        }
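ReplaceNodes takes the nodes to replace and a callback that computes each replacement. The sketch below shows the same mechanism in isolation, under the assumption that the replacement expression is built with SyntaxFactory.ParseExpression; this is a simplification and not how the tool described here builds its nodes. `LiteralRewriter` is a hypothetical name, and the block needs the Microsoft.CodeAnalysis.CSharp NuGet package.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class LiteralRewriter
{
    // Replaces every string literal whose value equals chineseText
    // with a GetText call that carries termCode and the original literal.
    public static string Rewrite(string sourceCode, string chineseText, string termCode)
    {
        var root = CSharpSyntaxTree.ParseText(sourceCode).GetCompilationUnitRoot();

        var targets = root.DescendantNodes()
            .OfType<LiteralExpressionSyntax>()
            .Where(l => l.IsKind(SyntaxKind.StringLiteralExpression)
                        && l.Token.ValueText == chineseText)
            .ToList();

        var newRoot = root.ReplaceNodes(targets, (original, _) =>
        {
            // The original literal is kept as the default-text argument.
            var call = $"T.Core.I18N.Service.TermService.Current.GetText(\"{termCode}\", {original})";
            return SyntaxFactory.ParseExpression(call).WithTriviaFrom(original);
        });

        return newRoot.ToFullString();
    }
}
```

Parsing the replacement as an expression yields a real invocation node, so the rewritten tree can be analyzed or rewritten again without reparsing the file.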

Here we abstracted an ExpressionSyntaxParser class that replaces the original expression with a call to T.Core.I18N.Service.TermService.Current.GetTextFormatted:

 public ExpressionSyntax ExpressionSyntaxTermReplace(ExpressionSyntax syntaxNode, string statementStr, IEnumerable<TermScanResult> terms
            , List<CommonTermDto> commonTerms, List<CommonTermDto> commSubTerms)
        {
            var expressionSyntax = GetExpressionSyntaxVerifyRule(syntaxNode, statementStr);
            var originalText = GetExpressionSyntaxOriginalText(expressionSyntax, statementStr);

            var I18Expr = "";
            var interpolationSyntaxes = syntaxNode.DescendantNodes().OfType<InterpolationSyntax>();         
            var term = terms.FirstOrDefault(i => i.ChineseText == originalText);

            if (term == null)
                return syntaxNode;
            string termcode = term.I18NTerm.Code;
            if (syntaxNode is InterpolatedStringExpressionSyntax)
            {
                if (interpolationSyntaxes.Count() > 0)
                {
                    var parms = "";
                    foreach (var item in interpolationSyntaxes)
                    {
                        parms += $",{item.ToString().TrimStart('{').TrimEnd('}')}";
                    }
                    I18Expr = "$\"{T.Core.I18N.Service.TermService.Current.GetTextFormatted(\"" + termcode + "\", " + originalText + parms + ")}\"";
                    var token1 = SyntaxFactory.Token(default, SyntaxKind.StringLiteralToken, I18Expr, "", default);
                    return SyntaxFactory.LiteralExpression(SyntaxKind.StringLiteralExpression, token1);
                }
                else
                {

                    var startToken = SyntaxFactory.Token(SyntaxKind.InterpolatedStringStartToken);
                    if ((syntaxNode as InterpolatedStringExpressionSyntax).StringStartToken.Value == startToken.Value)
                    {
                        // The original expression already starts with "$"
                        I18Expr = "$\"{T.Core.I18N.Service.TermService.Current.GetText(\"" + termcode + "\"," + originalText + ")}";
                    }
                    else
                    {
                        // The original expression has no "$"
                        I18Expr = "$\"{T.Core.I18N.Service.TermService.Current.GetText(\"" + termcode + "\",\\teld\"" + originalText + "\")}";
                        I18Expr = I18Expr.Replace("\\teld", "$");
                    }
                }
            }
            else
            {
                I18Expr = "$\"{T.Core.I18N.Service.TermService.Current.GetText(\"" + termcode + "\"," + originalText + ")}";
            }

            var token = SyntaxFactory.Token(default(SyntaxTriviaList), SyntaxKind.InterpolatedVerbatimStringStartToken, I18Expr, "$\"", default(SyntaxTriviaList));
            var literalExpressionSyntax = SyntaxFactory.InterpolatedStringExpression(token);
            return literalExpressionSyntax;
        }
T.Core.I18N.Service.TermService is the multilingual term service class. It provides a GetText method that returns the multilingual text for a given term code.
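For interpolated strings, the replacement passes the interpolation expressions as extra arguments to GetTextFormatted. One way to derive a translatable format string from an interpolated string is sketched below, assuming the term text uses {0}, {1}, ... placeholders; the article does not show how the tool itself stores such terms. `InterpolationConverter` is a hypothetical name, and the block needs the Microsoft.CodeAnalysis.CSharp NuGet package.

```csharp
using System.Collections.Generic;
using System.Text;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class InterpolationConverter
{
    // Turns an interpolated string into a composite format string
    // plus the list of argument expressions, in order of appearance.
    public static (string Format, List<string> Args) ToFormat(string interpolatedSource)
    {
        var expr = (InterpolatedStringExpressionSyntax)SyntaxFactory.ParseExpression(interpolatedSource);
        var format = new StringBuilder();
        var args = new List<string>();

        foreach (var content in expr.Contents)
        {
            switch (content)
            {
                case InterpolatedStringTextSyntax text:
                    // Keep the source text, so escapes such as {{ and }} stay valid for string.Format.
                    format.Append(text.TextToken.Text);
                    break;

                case InterpolationSyntax hole:
                    // {expr,align:fmt} becomes {index,align:fmt}; missing clauses append nothing.
                    format.Append('{').Append(args.Count)
                          .Append(hole.AlignmentClause)
                          .Append(hole.FormatClause)
                          .Append('}');
                    args.Add(hole.Expression.ToString());
                    break;
            }
        }

        return (format.ToString(), args);
    }
}
```

With this shape, translators see a single sentence with numbered placeholders, and a translation can reorder them to fit the grammar of the target language.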

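Once the replacement is complete, open Visual Studio, add a reference to the multilingual term service's NuGet package or DLL in the project, rebuild, and manually proofread the replaced code.

That concludes this walkthrough of internationalizing .NET applications by extracting terms and updating code with Roslyn.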



周國慶
2023/3/19