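GC Optimization Notes: Stack Memory, Span, NativeMemory, Pointers, and Pooled Memory
Converting between a struct and a memory pointer
There are many ways to do this online, and in my measurements their performance differs considerably. One of my projects is extremely performance-sensitive, so I researched and benchmarked the options repeatedly; the approaches below were the fastest in those tests.
First, define the struct layout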
[DebuggerDisplay("NameLength = {NameLength}, NodeIndex = {NodeIndex}, ParentNodeIndex = {ParentNodeIndex}, CreationTime = {StandardInformation.CreationTime}")]
[StructLayout(LayoutKind.Sequential, Pack = 1, CharSet = CharSet.Unicode)]
internal unsafe struct FileEntryNode : IEquatable<FileEntryNode> {
internal const byte MaxLength = 20;
internal const byte ExtensionNameMaxLength = 10;
// [FieldOffset(0)]
internal readonly Attributes Attributes;
// [FieldOffset(4)]
internal readonly UInt32 NodeIndex;
// [FieldOffset(8)]
internal readonly UInt32 ParentNodeIndex;
// [FieldOffset(16)]
internal readonly UInt64 Size;
// [FieldOffset(32)]
internal readonly StandardInformation StandardInformation;
internal readonly byte LogicalStatus;
internal byte NameLength;
internal readonly int NameOffset;
internal readonly int ExtensionNameIndex;
internal readonly byte ParentCount;
public readonly int PathLength;
}
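Reading a struct from a pointer position
Of the approaches I tested, this one was the fastest: it returns a pointer into the buffer and copies nothing.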
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public FileEntryNode* ReadNodePointerByPosition(long position){
Debug.Assert(position < Size);
// Add the byte offset first, then cast. Casting first, as in (FileEntryNode*)this.Pointer + position,
// does not work: the cast binds tighter than +, so position would be scaled by sizeof(FileEntryNode).
void* ptr = this.Pointer + position;
FileEntryNode* value = (FileEntryNode*)ptr;
return value;
}
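The pointer-returning read can be tried in isolation with a small sketch. `DemoNode` and `PointerReadDemo` are hypothetical stand-ins (not the `FileEntryNode` type above), `NativeMemory` is used only to get a buffer to read from, and the project is assumed to target .NET 6 or later with `AllowUnsafeBlocks` enabled.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for FileEntryNode, kept small for the demo.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct DemoNode
{
    public uint NodeIndex;
    public uint ParentNodeIndex;
}

public static unsafe class PointerReadDemo
{
    // Writes two nodes into native memory, then reads the second one back
    // through a typed pointer computed from a byte offset.
    public static uint ReadSecondNodeIndex()
    {
        nuint size = (nuint)(2 * sizeof(DemoNode));
        byte* basePtr = (byte*)NativeMemory.AllocZeroed(size);
        try
        {
            ((DemoNode*)basePtr)[0] = new DemoNode { NodeIndex = 10, ParentNodeIndex = 0 };
            ((DemoNode*)basePtr)[1] = new DemoNode { NodeIndex = 20, ParentNodeIndex = 10 };

            long position = sizeof(DemoNode); // byte offset of the second node

            // Add the byte offset to the byte* first, then cast the result.
            DemoNode* node = (DemoNode*)(basePtr + position);
            return node->NodeIndex;
        }
        finally
        {
            NativeMemory.Free(basePtr);
        }
    }
}
```

Calling `PointerReadDemo.ReadSecondNodeIndex()` reads the node stored at byte offset `sizeof(DemoNode)`. No struct is copied; the caller only dereferences the pointer.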
If the result is returned through an out parameter rather than as the method's return value, this was the fastest variant:
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public void Read(in long position, out T value){
Debug.Assert(position < Size);
if (ReadUseAccessor) {
Accessor.Read(position, out value);
}
else{
var valueSize = EntrySize;
byte* ptr = this.Pointer + position;
value = default(T);
var valuePtr = Unsafe.AsPointer(ref value);
Buffer.MemoryCopy(ptr, valuePtr, valueSize, valueSize);
// Unsafe.CopyBlockUnaligned(valuePtr, ptr, valueSize);
}
}
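The following method has the same effect, but it is slower than the first (pointer-returning) approach because it copies the struct.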
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public PinyinFullNode ReadNode(in long position){
Debug.Assert(position < Size);
var valueSize = EntrySize;
byte* ptr = this.Pointer + position;
PinyinFullNode value = default(PinyinFullNode);
var valuePtr = Unsafe.AsPointer(ref value);
Buffer.MemoryCopy(ptr, valuePtr, valueSize, valueSize);
return value;
}
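If you also have variable-length data (data whose length is not fixed), store its length in the main struct. Read the main struct first with one of the methods above, then pass the stored length to the conversion.
In my tests, the following code was the fastest: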
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public ReadOnlySpan<char> ReadString(in uint offset, in int length)
{
void* pointer = Pointer + offset;
var fileName = new ReadOnlySpan<char>(pointer, length);
return fileName;
}
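As a rough illustration of the same idea outside the original class, the sketch below copies a string into native memory and then views it as a `ReadOnlySpan<char>` without allocating a new string. `NativeStringDemo` is a hypothetical name, and .NET 6 or later with `AllowUnsafeBlocks` is assumed. Note that the span length is a count of `char` elements, not bytes.

```csharp
using System;
using System.Runtime.InteropServices;

public static unsafe class NativeStringDemo
{
    // Copies `text` into native memory, then wraps that memory in a span.
    // Returns true when the span shows the same characters as the input.
    public static bool RoundTrips(string text)
    {
        nuint byteCount = (nuint)(text.Length * sizeof(char));
        char* buffer = (char*)NativeMemory.Alloc(byteCount);
        try
        {
            text.AsSpan().CopyTo(new Span<char>(buffer, text.Length));

            // The length argument is in chars, not bytes.
            var view = new ReadOnlySpan<char>(buffer, text.Length);
            return view.SequenceEqual(text.AsSpan());
        }
        finally
        {
            NativeMemory.Free(buffer);
        }
    }
}
```

The span is only valid while the native buffer is alive, which is why the comparison happens before `NativeMemory.Free`.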
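Those are the fastest versions I found through repeated testing. So what does the slow approach look like?
Roughly like this. Most of what you find by searching online is inefficient code of this kind, where each call allocates unmanaged memory, copies the bytes, and marshals the structure: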
// converts byte[] to struct
public static T RawDeserialize<T>(byte[] rawData, int position) {
int rawsize = Marshal.SizeOf(typeof(T));
if (rawsize > rawData.Length - position) throw new ArgumentException("Not enough data to fill struct. Array length from position: " + (rawData.Length - position) + ", Struct length: " + rawsize);
IntPtr buffer = Marshal.AllocHGlobal(rawsize);
Marshal.Copy(rawData, position, buffer, rawsize);
T retobj = (T)Marshal.PtrToStructure(buffer, typeof(T));
Marshal.FreeHGlobal(buffer);
return retobj;
}
// converts a struct to byte[]
public static byte[] RawSerialize(object anything) {
int rawSize = Marshal.SizeOf(anything);
IntPtr buffer = Marshal.AllocHGlobal(rawSize);
Marshal.StructureToPtr(anything, buffer, false);
byte[] rawDatas = new byte[rawSize];
Marshal.Copy(buffer, rawDatas, 0, rawSize);
Marshal.FreeHGlobal(buffer);
return rawDatas;
}
Writing a struct to a memory pointer
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public void Write(in long position, ref T value) {
Debug.Assert(position < Size);
if (WriteUseAccessor) {
this.Accessor.Write(position, ref value);
}
else {
var valueSize = EntrySize;
byte* ptr = this.Pointer + position;
var valuePtr = Unsafe.AsPointer(ref value);
// Unsafe.CopyBlockUnaligned(ptr, valuePtr, (uint) valueSize);
Buffer.MemoryCopy(valuePtr, ptr, valueSize, valueSize);
}
}
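The write and read paths can be checked together with a small round trip. This is a minimal sketch using the same `Unsafe.AsPointer` plus `Buffer.MemoryCopy` technique; `StructCopyDemo` and its method names are hypothetical, the accessor branch is left out, and .NET 6 or later with `AllowUnsafeBlocks` is assumed.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static unsafe class StructCopyDemo
{
    // Copies the bytes of `value` to `destination`.
    public static void Write<T>(byte* destination, ref T value) where T : unmanaged
    {
        long size = sizeof(T);
        Buffer.MemoryCopy(Unsafe.AsPointer(ref value), destination, size, size);
    }

    // Copies sizeof(T) bytes from `source` into `value`.
    public static void Read<T>(byte* source, out T value) where T : unmanaged
    {
        long size = sizeof(T);
        value = default;
        Buffer.MemoryCopy(source, Unsafe.AsPointer(ref value), size, size);
    }

    // Writes a value into native memory and reads it back.
    public static long RoundTrip(long input)
    {
        byte* buffer = (byte*)NativeMemory.Alloc((nuint)sizeof(long));
        try
        {
            Write(buffer, ref input);
            Read(buffer, out long output);
            return output;
        }
        finally
        {
            NativeMemory.Free(buffer);
        }
    }
}
```

`Unsafe.AsPointer` does not pin anything, so this sketch only passes refs to stack locals and parameters. A ref into an object on the managed heap would need to be pinned with `fixed` first.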
stackalloc
Use stack memory to reduce GC pressure
var wordMatchCounts = stackalloc float[wordCount];
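Stack space is limited, so a common pattern is to use `stackalloc` only for small sizes and fall back to a pooled array otherwise. The sketch below shows that pattern in safe code through `Span<float>`; `StackallocDemo` and the 256-element threshold are assumptions made for the example, not values from the original project.

```csharp
using System;
using System.Buffers;

public static class StackallocDemo
{
    private const int StackLimit = 256; // assumed threshold for this sketch

    // Fills a temporary buffer with 0..count-1 and returns the sum of squares.
    public static float SumOfSquares(int count)
    {
        float[]? rented = null;

        // Small buffers live on the stack; larger ones come from the shared pool.
        Span<float> buffer = count <= StackLimit
            ? stackalloc float[StackLimit]
            : (rented = ArrayPool<float>.Shared.Rent(count));
        Span<float> values = buffer.Slice(0, count);

        try
        {
            for (int i = 0; i < count; i++) values[i] = i;

            float sum = 0;
            foreach (float v in values) sum += v * v;
            return sum;
        }
        finally
        {
            if (rented != null) ArrayPool<float>.Shared.Return(rented);
        }
    }
}
```

Assigning the `stackalloc` result to a `Span<T>` needs no `unsafe` context, and neither branch allocates on the GC heap on the hot path.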
Span
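Span supports the idea of reinterpret_cast: a Span<byte> can be reinterpreted as a Span<int> (via MemoryMarshal.Cast), where index 0 of the Span<int> maps to the first four bytes of the Span<byte>. So if you read a buffer of bytes, you can pass it safely and efficiently to methods that operate on grouped bytes as integers.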
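A minimal sketch of the reinterpretation, using `MemoryMarshal.Cast` from `System.Runtime.InteropServices`. `SpanCastDemo` is a hypothetical name. The values are written and read through the same `int` view, so the result does not depend on the machine's byte order.

```csharp
using System;
using System.Runtime.InteropServices;

public static class SpanCastDemo
{
    // Treats the bytes as a sequence of ints and sums them.
    public static int SumAsInts(Span<byte> bytes)
    {
        Span<int> ints = MemoryMarshal.Cast<byte, int>(bytes);
        int sum = 0;
        foreach (int v in ints) sum += v;
        return sum;
    }

    public static int Demo()
    {
        Span<byte> bytes = stackalloc byte[8];

        // Eight bytes are viewed as two ints; no data is copied.
        Span<int> ints = MemoryMarshal.Cast<byte, int>(bytes);
        ints[0] = 1;
        ints[1] = 2;

        return SumAsInts(bytes);
    }
}
```

If the byte length is not a multiple of four, the trailing bytes are simply not part of the `Span<int>`.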
ValueListBuilder & .AsSpan()
A Span can also back a collection. ValueListBuilder is an internal performance helper in the .NET Core source: a list-like builder whose contents can be exposed as a Span through .AsSpan().
It is used in String.Replace:
public unsafe string Replace(string oldValue, string? newValue) {
ArgumentException.ThrowIfNullOrEmpty(oldValue, nameof (oldValue));
if (newValue == null) newValue = string.Empty;
// The initial buffer lives on the stack: 128 ints (512 bytes)
ValueListBuilder<int> valueListBuilder = new ValueListBuilder<int>(stackalloc int[128]);
if (oldValue.Length == 1)
{
if (newValue.Length == 1)
return this.Replace(oldValue[0], newValue[0]);
char ch = oldValue[0];
int elementOffset = 0;
while (true)
{
int num = SpanHelpers.IndexOf(ref Unsafe.Add<char>(ref this._firstChar, elementOffset), ch, this.Length - elementOffset);
if (num >= 0){
valueListBuilder.Append(elementOffset + num);
elementOffset += num + 1;
}
else break;
}
}
else{
int elementOffset = 0;
while (true){
int num = SpanHelpers.IndexOf(ref Unsafe.Add<char>(ref this._firstChar, elementOffset), this.Length - elementOffset, ref oldValue._firstChar, oldValue.Length);
if (num >= 0){
valueListBuilder.Append(elementOffset + num);
elementOffset += num + oldValue.Length;
}
else break;
}
}
if (valueListBuilder.Length == 0) return this;
string str = this.ReplaceHelper(oldValue.Length, newValue, valueListBuilder.AsSpan());
valueListBuilder.Dispose();
return str;
}
Getting a Span<T> directly over a collection inside .NET: CollectionsMarshal.AsSpan<string>(List<string>)
private static unsafe string JoinCore<T>(ReadOnlySpan<char> separator, IEnumerable<T> values){
if (typeof (T) == typeof (string)){
if (values is List<string> list)
return string.JoinCore(separator, (ReadOnlySpan<string>) CollectionsMarshal.AsSpan<string>(list));
if (values is string[] array)
return string.JoinCore(separator, new ReadOnlySpan<string>(array));
}
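`CollectionsMarshal.AsSpan` is a public API (in `System.Runtime.InteropServices`, .NET 5 and later), so application code can use it too. The sketch below, with the hypothetical name `ListSpanDemo`, edits a list in place through the span over its backing array.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class ListSpanDemo
{
    // Doubles every element in place and returns the new sum.
    public static int DoubleInPlace(List<int> list)
    {
        // The span points at the list's internal array; nothing is copied.
        Span<int> span = CollectionsMarshal.AsSpan(list);

        int sum = 0;
        for (int i = 0; i < span.Length; i++)
        {
            span[i] *= 2;
            sum += span[i];
        }
        return sum;
    }
}
```

Items should not be added to or removed from the list while the span is in use, because the list may replace its backing array.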
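ref struct and ref locals: reading value types by ref to avoid copies
Reading a value type through ref avoids copying it. Be aware, though, that any change made through the ref also changes the original value, because you are in effect working with a pointer.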
ref var hierarchy = ref ph[i];
ref var words = ref hierarchy.Words;
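The difference between a copy and a ref local is easy to see with a small sketch. `Counter` and `RefLocalDemo` are hypothetical names used only for this illustration.

```csharp
using System;

public struct Counter
{
    public int Hits;
}

public static class RefLocalDemo
{
    // Returns the array element's Hits after a change through a copy,
    // and again after a change through a ref local.
    public static (int AfterCopy, int AfterRef) Run()
    {
        var counters = new Counter[1];

        // Plain assignment copies the struct; the array element is untouched.
        Counter copy = counters[0];
        copy.Hits++;
        int afterCopy = counters[0].Hits;

        // A ref local aliases the array element; the change is visible in the array.
        ref Counter alias = ref counters[0];
        alias.Hits++;
        int afterRef = counters[0].Hits;

        return (afterCopy, afterRef);
    }
}
```

For large structs the ref local also avoids the cost of the copy itself, which is the reason for using it in hot loops.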
Unsafe.IsNullRef
You can use Unsafe.IsNullRef to check whether a ref is null. If the user has not initialized Foo.X, it defaults to a null reference:
ref struct Foo {
public ref int X;
public bool IsNull => Unsafe.IsNullRef(ref X);
public Foo(ref int x) { X = ref x; }
}
NativeMemory
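Compared with Marshal.AllocHGlobal and Marshal.FreeHGlobal, NativeMemory.* is now the recommended choice. It has several advantages:
- It lets you control whether the memory is zero-initialized
- It lets you control memory alignment
- Sizes are nuint, so a 64-bit process can allocate more than int.MaxValue bytes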
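The first two points can be sketched with `NativeMemory.AllocZeroed` and `NativeMemory.AlignedAlloc`. `NativeMemoryDemo` is a hypothetical name, the sizes and the 64-byte alignment are arbitrary choices for the example, and .NET 6 or later with `AllowUnsafeBlocks` is assumed.

```csharp
using System;
using System.Runtime.InteropServices;

public static unsafe class NativeMemoryDemo
{
    // Returns true when the zeroed block is zero and the aligned block is 64-byte aligned.
    public static bool Run()
    {
        // Zero-initialized: 4 elements of sizeof(int) bytes each.
        int* zeroed = (int*)NativeMemory.AllocZeroed(4, (nuint)sizeof(int));

        // Aligned: 256 bytes on a 64-byte boundary (alignment must be a power of two).
        void* aligned = NativeMemory.AlignedAlloc(256, 64);

        try
        {
            bool allZero = zeroed[0] == 0 && zeroed[3] == 0;
            bool isAligned = ((nuint)aligned % 64) == 0;
            return allZero && isAligned;
        }
        finally
        {
            // Each allocation is released with its matching free method.
            NativeMemory.Free(zeroed);
            NativeMemory.AlignedFree(aligned);
        }
    }
}
```

Plain `NativeMemory.Alloc` skips the zeroing, which is the cheaper choice when the buffer is about to be overwritten anyway.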
A walkthrough of the string.Join implementation
CollectionsMarshal.AsSpan(valuesList)
if (values is List<string?> valuesList) {
return JoinCore(separator.AsSpan(), CollectionsMarshal.AsSpan(valuesList));
}
if (values is string?[] valuesArray)
{
return JoinCore(separator.AsSpan(), new ReadOnlySpan<string?>(valuesArray));
}
Join
public static string Join(string? separator, IEnumerable<string?> values)
{
if (values is List<string?> valuesList)
{
return JoinCore(separator.AsSpan(), CollectionsMarshal.AsSpan(valuesList));
}
if (values is string?[] valuesArray)
{
return JoinCore(separator.AsSpan(), new ReadOnlySpan<string?>(valuesArray));
}
if (values == null)
{
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.values);
}
using (IEnumerator<string?> en = values.GetEnumerator())
{
if (!en.MoveNext())
{
return Empty;
}
string? firstValue = en.Current;
if (!en.MoveNext())
{
// Only one value available
return firstValue ?? Empty;
}
// Null separator and values are handled by the StringBuilder
var result = new ValueStringBuilder(stackalloc char[256]);
result.Append(firstValue);
do
{
result.Append(separator);
result.Append(en.Current);
}
while (en.MoveNext());
return result.ToString();
}
}
JoinCore
private static string JoinCore(ReadOnlySpan<char> separator, ReadOnlySpan<string?> values)
{
if (values.Length <= 1)
{
return values.IsEmpty ?
Empty :
values[0] ?? Empty;
}
long totalSeparatorsLength = (long)(values.Length - 1) * separator.Length;
if (totalSeparatorsLength > int.MaxValue)
{
ThrowHelper.ThrowOutOfMemoryException();
}
int totalLength = (int)totalSeparatorsLength;
// Calculate the length of the resultant string so we know how much space to allocate.
foreach (string? value in values)
{
if (value != null)
{
totalLength += value.Length;
if (totalLength < 0) // Check for overflow
{
ThrowHelper.ThrowOutOfMemoryException();
}
}
}
// Copy each of the strings into the result buffer, interleaving with the separator.
string result = FastAllocateString(totalLength);
int copiedLength = 0;
for (int i = 0; i < values.Length; i++)
{
// It's possible that another thread may have mutated the input array
// such that our second read of an index will not be the same string
// we got during the first read.
// We range check again to avoid buffer overflows if this happens.
if (values[i] is string value)
{
int valueLen = value.Length;
if (valueLen > totalLength - copiedLength)
{
copiedLength = -1;
break;
}
// Fill in the value.
FillStringChecked(result, copiedLength, value);
copiedLength += valueLen;
}
if (i < values.Length - 1)
{
// Fill in the separator.
// Special-case length 1 to avoid additional overheads of CopyTo.
// This is common due to the char separator overload.
ref char dest = ref Unsafe.Add(ref result._firstChar, copiedLength);
if (separator.Length == 1)
{
dest = separator[0];
}
else
{
separator.CopyTo(new Span<char>(ref dest, separator.Length));
}
copiedLength += separator.Length;
}
}
// If we copied exactly the right amount, return the new string. Otherwise,
// something changed concurrently to mutate the input array: fall back to
// doing the concatenation again, but this time with a defensive copy. This
// fall back should be extremely rare.
return copiedLength == totalLength ?
result :
JoinCore(separator, values.ToArray().AsSpan());
}
Internal performance helpers in .NET Core
ValueStringBuilder
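Reading the string.Join source shows that it uses a non-public ValueStringBuilder. Its constructor lets you choose stack memory or pooled memory as the backing buffer, which reduces GC pressure and memory overhead.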
// Null separator and values are handled by the StringBuilder
var result = new ValueStringBuilder(stackalloc char[256]);
result.Append(firstValue);
do {
result.Append(separator);
result.Append(en.Current);
}
while (en.MoveNext());
return result.ToString();
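ValueStringBuilder is internal, so application code cannot use it directly. A similar effect for short strings can be approximated with public APIs only. The sketch below is not the framework's implementation: `StackJoinDemo`, `JoinSmall`, and the 256-char buffer are assumptions for illustration, and the method falls back to `string.Join` when the result does not fit.

```csharp
using System;

public static class StackJoinDemo
{
    // Joins values with a separator, building the result in a stack buffer.
    public static string JoinSmall(string[] values, char separator)
    {
        Span<char> buffer = stackalloc char[256];
        int pos = 0;

        for (int i = 0; i < values.Length; i++)
        {
            if (i > 0)
            {
                // Out of room: let the framework handle it.
                if (pos >= buffer.Length) return string.Join(separator, values);
                buffer[pos++] = separator;
            }

            // A null string yields an empty span, matching string.Join's handling of nulls.
            ReadOnlySpan<char> part = values[i].AsSpan();
            if (!part.TryCopyTo(buffer.Slice(pos))) return string.Join(separator, values);
            pos += part.Length;
        }

        // The only heap allocation is the final string.
        return buffer.Slice(0, pos).ToString();
    }
}
```

For example, `StackJoinDemo.JoinSmall(new[] { "a", "b", "c" }, ',')` builds the result in the stack buffer and allocates only the returned string.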
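Object reuse
ArrayPool (pooled arrays) & PooledList (pooled collections)
ArrayPool keeps a shared pool of arrays, with part of it cached in per-thread slots. When you need an array, you rent one from the pool instead of creating a new array each time; this avoids frequent allocations and so reduces the number of GCs.
PooledList is a third-party library. Instead of storing its data in a directly allocated new array, it stores it in an array rented from ArrayPool.
Wrap it in using so that the array is returned to the ArrayPool when you are done. If you forget to return it, it is returned by the finalizer.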
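Renting and returning with `ArrayPool<T>.Shared` looks roughly like this. `ArrayPoolDemo` is a hypothetical name. The rented array may be longer than requested, so the sketch slices it to the requested length before use.

```csharp
using System;
using System.Buffers;

public static class ArrayPoolDemo
{
    // Fills a rented array with 1..count and returns the sum.
    public static int SumFirst(int count)
    {
        int[] rented = ArrayPool<int>.Shared.Rent(count);
        try
        {
            // Rent may hand back a larger array; only use the requested part.
            Span<int> values = rented.AsSpan(0, count);
            for (int i = 0; i < count; i++) values[i] = i + 1;

            int sum = 0;
            foreach (int v in values) sum += v;
            return sum;
        }
        finally
        {
            // Always return the array so later callers can reuse it.
            ArrayPool<int>.Shared.Return(rented);
        }
    }
}
```

A rented array is not cleared by default, so code should not assume it starts out zeroed.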