Use `ConcurrentDictionary.Values` in C# with Caution

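In multithreaded code, ConcurrentDictionary is a widely used data structure that provides thread-safe dictionary operations. Its Values property, however, deserves extra care, especially with large data sets. This article uses a sample program to analyze the potential problem with ConcurrentDictionary.Values and discusses ways to work around it.

Problem Description

The following simple program shows the memory fluctuation that appears when ConcurrentDictionary.Values is called frequently from multiple threads: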

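using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

internal class Program
{
  static void Main(string[] args)
  {
      // Call Test() from many threads so that Values is read concurrently.
      Parallel.For(1, 100000, i =>
      {
          Test();
          Console.WriteLine($"Call #{i}");
      });
      Console.WriteLine("Hello, World!");
      Console.ReadLine();
  }

  public static void Test()
  {
      // Each call reads Values, which builds a new snapshot collection.
      var query = CacheHelper.GetAll();
      Console.WriteLine($"{query.Count}");
      // Simulate some work per call.
      Thread.Sleep(100);
  }
}

public class CacheHelper
{
  static readonly ConcurrentDictionary<string, string> allDic = new ConcurrentDictionary<string, string>();

  static CacheHelper()
  {
      // Fill the cache with 80,000 entries, each holding a long string.
      for (int i = 0; i < 80000; i++)
      {
          allDic.TryAdd(i.ToString(), string.Join(",", Enumerable.Range(0, 500)));
      }
  }

  public static ICollection<string> GetAll()
  {
      return allDic.Values;
  }
}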

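Observed Behavior

When the code above runs, the process's memory usage climbs steadily, reaches a peak, is reclaimed, and then starts climbing again. The author observes that this fluctuation is especially noticeable when the dictionary holds large strings. A dotMemory capture of the process shows this pattern.

Source Code Analysis

Looking at the ConcurrentDictionary source makes it clear how the Values property works: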

private ReadOnlyCollection<TValue> GetValues()
{
  int locksAcquired = 0;
  try
  {
      AcquireAllLocks(ref locksAcquired);
      int countNoLocks = GetCountNoLocks();
      if (countNoLocks == 0)
      {
          return ReadOnlyCollection<TValue>.Empty;
      }
      TValue[] array = new TValue[countNoLocks];
      int num = 0;
      VolatileNode[] buckets = _tables._buckets;
      for (int i = 0; i < buckets.Length; i++)
      {
          VolatileNode volatileNode = buckets[i];
          for (Node node = volatileNode._node; node != null; node = node._next)
          {
              array[num] = node._value;
              num++;
          }
      }
      return new ReadOnlyCollection<TValue>(array);
  }
  finally
  {
      ReleaseLocks(locksAcquired);
  }
}

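Key Points

Every call to Values allocates a new array:
```
TValue[] array = new TValue[countNoLocks];
```
Each read of Values creates a new TValue[] instead of returning a reference to the dictionary's internal storage. This is presumably done for thread safety: the copy is taken while all of the dictionary's locks are held, so the caller receives a consistent snapshot. Under high concurrency, however, it leads to frequent memory allocation.

The size and number of stored objects aggravate the problem:
The sample dictionary holds 80,000 entries. For a reference type such as string, the array stores references rather than copies of the strings, so its size is determined by the entry count: roughly 80,000 × 8 bytes ≈ 640 KB on a 64-bit runtime. That is above the 85,000-byte threshold, so every call allocates on the large object heap, which is reclaimed only by generation 2 collections. With many threads calling Values, these temporary arrays accumulate between collections and GC pressure rises noticeably. For large value types the array would hold the values themselves and grow accordingly.

Comparison with earlier versions:
In .NET 5 the same logic filled a List<TValue> instead. The behavior is essentially the same as in the current version: a new temporary container is created on every call.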
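
The points above can be checked with a small measurement. The following is a minimal sketch, not part of the original program: the class `ValuesAllocationDemo` and its method names are hypothetical, and the byte count it reports is approximate and depends on the runtime version and pointer size.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class ValuesAllocationDemo
{
    // Builds a dictionary with `count` small entries (hypothetical helper for this sketch).
    public static ConcurrentDictionary<string, string> Build(int count)
    {
        var dic = new ConcurrentDictionary<string, string>();
        for (int i = 0; i < count; i++)
        {
            dic.TryAdd(i.ToString(), "v" + i);
        }
        return dic;
    }

    // Returns true when two consecutive reads of Values yield different collection objects.
    // Intended for a non-empty dictionary.
    public static bool ReturnsFreshSnapshot(ConcurrentDictionary<string, string> dic)
    {
        ICollection<string> first = dic.Values;
        ICollection<string> second = dic.Values;
        return !ReferenceEquals(first, second);
    }

    // Approximate number of bytes allocated on the current thread by one read of Values.
    public static long BytesAllocatedByValues(ConcurrentDictionary<string, string> dic)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        ICollection<string> snapshot = dic.Values;
        long after = GC.GetAllocatedBytesForCurrentThread();
        GC.KeepAlive(snapshot);
        return after - before;
    }

    // Prints both measurements for a dictionary of the same size as in the article.
    public static void Demo()
    {
        var dic = Build(80000);
        Console.WriteLine($"Fresh snapshot per call: {ReturnsFreshSnapshot(dic)}");
        Console.WriteLine($"Bytes allocated by one Values read: {BytesAllocatedByValues(dic)}");
    }
}
```

Calling `ValuesAllocationDemo.Demo()` from a console application prints both results. The values in this sketch are deliberately short strings: the allocation per call should still be at least the entry count times the pointer size, which illustrates that the cost comes from the number of entries rather than from the length of each string.
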
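Optimization Suggestions

The following approaches can mitigate the problem:
1. Avoid calling ConcurrentDictionary.Values frequently

With large data sets or under high concurrency, avoid reading ConcurrentDictionary.Values directly on hot paths. Design a more efficient access pattern around what the caller actually needs.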
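
As one possible reading of this advice, the sketch below replaces two common uses of Values with members that do not build a snapshot array. The class `CacheReader` and its methods are hypothetical names introduced for illustration. `Count` still takes the dictionary's locks but does not copy the values, and enumerating the dictionary itself is documented as safe alongside concurrent writes, with the caveat that it is not a moment-in-time snapshot.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class CacheReader
{
    // Counts entries without materializing a collection of the values.
    public static int CountEntries(ConcurrentDictionary<string, string> dic)
    {
        return dic.Count;
    }

    // Sums the value lengths by enumerating the dictionary directly.
    // Entries added or removed during the loop may or may not be seen.
    public static long TotalValueLength(ConcurrentDictionary<string, string> dic)
    {
        long total = 0;
        foreach (KeyValuePair<string, string> pair in dic)
        {
            total += pair.Value.Length;
        }
        return total;
    }
}
```

In the sample program, `Test()` only needs the number of entries, so something like `CacheReader.CountEntries(...)` would be enough there. Whether the weaker consistency of direct enumeration is acceptable depends on the caller.
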

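2. Use `lock` + `Dictionary` instead

Dictionary itself is not thread-safe, but its Values property returns a view over the dictionary's internal storage rather than allocating a copy. In some scenarios, lock + Dictionary can therefore replace ConcurrentDictionary.

Sample code: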

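using System.Collections.Generic;
using System.Linq;

public class CacheHelper
{
  private static readonly Dictionary<string, string> allDic = new Dictionary<string, string>();
  private static readonly object lockObj = new object();

  static CacheHelper()
  {
      for (int i = 0; i < 80000; i++)
      {
          allDic.Add(i.ToString(), string.Join(",", Enumerable.Range(0, 500)));
      }
  }

  public static ICollection<string> GetAll()
  {
      // The lock only covers fetching the view; callers enumerate it outside the lock.
      // That is safe here only because allDic is never modified after the static constructor.
      lock (lockObj)
      {
          // Values is a live view over allDic, not a copy.
          return allDic.Values;
      }
  }
}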

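This avoids allocating a large new array on every call to Values. Note that the lock protects only the retrieval of the view: the caller enumerates it after the lock has been released, so the approach is thread-safe only as long as the dictionary is not modified after initialization, as in this example. If writers exist, the enumeration itself has to happen inside the lock.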
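
For a cache that is also written to after startup, the enumeration has to run under the same lock as the writes. The following is a minimal sketch of that variant, assuming callers can express their work as a short callback; `LockedCache`, `Set` and `ForEachValue` are hypothetical names and not part of the original example.

```csharp
using System;
using System.Collections.Generic;

public class LockedCache
{
    private readonly Dictionary<string, string> _items = new Dictionary<string, string>();
    private readonly object _gate = new object();

    // Adds or replaces an entry under the lock.
    public void Set(string key, string value)
    {
        lock (_gate)
        {
            _items[key] = value;
        }
    }

    // Number of entries, read under the lock.
    public int Count
    {
        get
        {
            lock (_gate)
            {
                return _items.Count;
            }
        }
    }

    // Runs `visit` for every value while holding the lock, so writers cannot
    // modify the dictionary during the enumeration. No copy of the values is made.
    // The callback should be short and must not call Set on this cache.
    public void ForEachValue(Action<string> visit)
    {
        lock (_gate)
        {
            foreach (string value in _items.Values)
            {
                visit(value);
            }
        }
    }
}
```

The trade-off is that readers now block writers, and each other, for the duration of the callback, so this fits best when enumerations are short or rare.
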

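Summary

ConcurrentDictionary is a powerful thread-safe data structure, but its Values property needs particular care under high concurrency with large data sets. Understanding its implementation and allocation behavior suggests the following strategies:
1. Call Values less often to avoid repeatedly allocating temporary memory.
2. Where it fits, use lock + Dictionary instead, which reduces GC pressure as long as every access to the dictionary is properly synchronized.
Tools such as dotMemory help to locate and fix problems of this kind.

## References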
https://www.cnblogs.com/huangxincheng/p/15329098.html

Posted 2025-01-11 23:30 by dotNet编程拾光
