On forums and other public sites, text submitted by users often needs to be filtered. Here is one way to do it:

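using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

/// <summary>
/// Replaces censored words in a text with asterisks.
/// </summary>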
/// <remarks>http://wintersun.cnblogs.com</remarks>
public class Censor
{
    /// <summary>
    /// Gets the censored words (the setter is private).
    /// </summary>
    /// <value>The censored words.</value>
    public IList<string> CensoredWords { get; private set; }

    /// <summary>
    /// Initializes a new instance of the <see cref="Censor"/> class.
    /// </summary>
    /// <param name="censoredWords">The censored words.</param>
    public Censor(IEnumerable<string> censoredWords)
    {
        if (censoredWords == null)
            throw new ArgumentNullException("censoredWords");

        CensoredWords = new List<string>(censoredWords);
    }

    /// <summary>
    /// Censors the text.
    /// </summary>
    /// <param name="text">The text.</param>
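    /// <returns>The text with every censored word replaced by asterisks.</returns>
    public string CensorText(string text)
    {
        if (text == null)
            throw new ArgumentNullException("text");

        // Nothing to censor in an empty string.
        if (text.Length == 0)
            return text;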

        string censoredText = text;

        foreach (string censoredWord in CensoredWords)
        {
            string regularExpression = ToRegexPattern(censoredWord);

            censoredText = Regex.Replace(censoredText, regularExpression, StarCensoredMatch,
              RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled);
        }

        return censoredText;
    }

    /// <summary>
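    /// Converts a wildcard pattern to a regular expression.
    /// </summary>
    /// <param name="wildcardSearch">The wildcard pattern; * matches any run of characters, ? matches one character.</param>
    /// <returns>The regular expression, anchored at word boundaries.</returns>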
    private string ToRegexPattern(string wildcardSearch)
    {
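        // Escape regex metacharacters first, then turn the escaped wildcards into regex tokens.
        string regexPattern = Regex.Escape(wildcardSearch);

        regexPattern = regexPattern.Replace(@"\*", ".*?");
        regexPattern = regexPattern.Replace(@"\?", ".");

        // A leading wildcard is replaced by an optional zero-width group,
        // so the match still has to begin at a word boundary.
        if (regexPattern.StartsWith(".*?", StringComparison.Ordinal))
        {
            regexPattern = regexPattern.Substring(3);
            regexPattern = @"(^\b)*?" + regexPattern;
        }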

        regexPattern = @"\b" + regexPattern + @"\b";

        return regexPattern;
    }

    /// <summary>
    /// Replaces a matched word with asterisks of the same length.
    /// </summary>
    /// <param name="m">The regex match for a censored word.</param>
    /// <returns>A string of asterisks as long as the match.</returns>
    private static string StarCensoredMatch(Match m)
    {
        return new string('*', m.Length);
    }
}
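
To make the wildcard handling more concrete, the sketch below reproduces the conversion done by ToRegexPattern in a standalone helper and prints the pattern generated for a few entries. PatternDemo and ToPattern are hypothetical names used only for illustration; the logic is copied from the class above, and the sample entries are assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical standalone copy of Censor.ToRegexPattern, for illustration only.
public static class PatternDemo
{
    public static string ToPattern(string wildcardSearch)
    {
        // Escape everything, then turn the escaped wildcards back into regex tokens.
        string pattern = Regex.Escape(wildcardSearch);
        pattern = pattern.Replace(@"\*", ".*?");
        pattern = pattern.Replace(@"\?", ".");

        // Same leading-wildcard handling as the original method.
        if (pattern.StartsWith(".*?", StringComparison.Ordinal))
        {
            pattern = @"(^\b)*?" + pattern.Substring(3);
        }

        return @"\b" + pattern + @"\b";
    }

    public static void Main()
    {
        // Sample entries (assumed): a plain word, a trailing *, and a ? wildcard.
        foreach (string word in new[] { "gosh", "darn*", "d?rn" })
        {
            Console.WriteLine(word + " => " + ToPattern(word));
        }
    }
}
```

With these inputs, "gosh" becomes \bgosh\b, "darn*" becomes \bdarn.*?\b, and "d?rn" becomes \bd.rn\b. Because every pattern is wrapped in \b, an entry only matches whole words unless it carries a wildcard.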

Now let's look at the unit test:

/// <summary>
/// Censors the text test.
/// </summary>
/// <remarks>http://wintersun.cnblogs.com</remarks>
[Test]
public void CensorTextTest()
{
    //arrange
    IList<string> censoredWords = new List<string>
    {
      "gosh",
      "drat",
      "darn*",
      "*fuck*",
      "ass hole"
    };

    Censor censor = new Censor(censoredWords);
    string result = string.Empty;

    //act
    result = censor.CensorText("I stubbed my toe. Gosh it hurts!");
    //assert
    Assert.AreEqual("I stubbed my toe. **** it hurts!", result);

    result = censor.CensorText("The midrate on the USD -> EUR forex trade has soured my day. Drat!");
    Assert.AreEqual("The midrate on the USD -> EUR forex trade has soured my day. ****!", result);

    result = censor.CensorText("Gosh darnit, my shoe laces are undone.fuck you ass hole.");
    Assert.AreEqual("**** ******, my shoe laces are undone.**** you ********.", result);
}

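As for censoredWords, you can read the list from a text file (File.ReadAllLines) or from another data source such as XML or a database.
That part is up to you. I hope this post is helpful.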
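
Loading the word list from a text file could look like the sketch below. The file name censored-words.txt, the one-entry-per-line layout, and the WordListLoader helper are assumptions for illustration and not part of the original class; entries are trimmed and blank lines are skipped.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: reads one censored entry per line from a text file.
public static class WordListLoader
{
    public static IList<string> Load(string path)
    {
        var words = new List<string>();

        foreach (string line in File.ReadAllLines(path))
        {
            string word = line.Trim();

            // Skip blank lines so they do not turn into empty patterns.
            if (word.Length > 0)
                words.Add(word);
        }

        return words;
    }

    public static void Main()
    {
        // Assumed file name; a small sample is written first so the sketch runs on its own.
        string path = Path.Combine(Path.GetTempPath(), "censored-words.txt");
        File.WriteAllLines(path, new[] { "gosh", "", "  darn*  " });

        IList<string> words = Load(path);
        Console.WriteLine(string.Join(", ", words));

        // The list can then be handed to the class above: new Censor(words)
    }
}
```

Since the Censor constructor accepts any IEnumerable<string>, the same approach should carry over to XML or database sources: only the Load method would change.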

Author:Petter Liu   http://wintersun.cnblogs.com
