
ASP.NET Core: Tracing the Cause of High Disk IO Through the Source Code

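The problem

While load-testing a Web API project with JMeter over the past couple of days, I noticed that disk IO was very high.

The endpoint under test is very simple: it reads data from the database, caches it in memory, and returns it, so it should not produce that much disk IO. I started investigating and found that the cause was HTTP request and response bodies being buffered to disk. The analysis is recorded below.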

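Finding which files are read and written

To find the cause, we first need to see which files the Web API process is actually reading and writing. A useful tool for this is Process Monitor. After downloading it, click the Filter icon in the toolbar and filter by your process name (process ID or any other criterion works too) to see the file activity of the process.

Clearly, the program is reading and writing ASPNETCORE_****.tmp files in large volumes. Opening one of them in VS Code shows that it contains the JSON string serialized from the data the endpoint queried: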

{
    ....
    // private data, not shown here
}
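As a cross-check without Process Monitor, the buffer files can also be counted from code. The sketch below is only an illustration: `TempFileProbe` and its methods are hypothetical names, and the lookup of the `ASPNETCORE_TEMP` environment variable with a fallback to the system temp path is my assumption about where the framework puts these files, so verify it against your own environment.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of ASP.NET Core.
public static class TempFileProbe
{
    // Assumption: buffer files land in ASPNETCORE_TEMP if it is set,
    // otherwise in the system temp directory.
    public static string GuessTempDirectory() =>
        Environment.GetEnvironmentVariable("ASPNETCORE_TEMP") ?? Path.GetTempPath();

    // Counts files that match the ASPNETCORE_*.tmp pattern seen in Process Monitor.
    public static int CountBufferFiles(string directory) =>
        Directory.EnumerateFiles(directory, "ASPNETCORE_*.tmp").Count();
}
```

Calling `TempFileProbe.CountBufferFiles(TempFileProbe.GuessTempDirectory())` repeatedly during a load test gives a rough picture of how many buffer files exist at a given moment.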

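Let's analyze the cause through the code.

Code analysis

For this step, first clone the official source from the aspnetcore repository. Then open the source folder in VS Code, search for .tmp, and filter the results to .cs files.

Clearly, FileBufferingWriteStream and FileBufferingReadStream in the search results are the classes we are looking for. Let's look at the code of FileBufferingWriteStream first: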

/// <summary>
/// A <see cref="Stream"/> that buffers content to be written to disk. Use <see cref="DrainBufferAsync(Stream, CancellationToken)" />
/// to write buffered content to a target <see cref="Stream" />.
/// </summary>
public sealed class FileBufferingWriteStream : Stream
{
    private const int DefaultMemoryThreshold = 32 * 1024; // 32k

    private readonly int _memoryThreshold;
    private readonly long? _bufferLimit;
    private readonly Func<string> _tempFileDirectoryAccessor;
    /// <summary>
    /// The maximum amount of memory in bytes to allocate before switching to a file on disk.
    /// </summary>
    /// <remarks>
    /// Defaults to 32kb.
    /// </remarks>
    public int MemoryThreshold => _memoryThreshold;
    /// <inheritdoc />
    public override void Write(byte[] buffer, int offset, int count)
    {
        ThrowArgumentException(buffer, offset, count);
        ThrowIfDisposed();

        if (_bufferLimit.HasValue && _bufferLimit - Length < count)
        {
            Dispose();
            throw new IOException("Buffer limit exceeded.");
        }

        // Allow buffering in memory if we're below the memory threshold once the current buffer is written.
        var allowMemoryBuffer = (_memoryThreshold - count) >= PagedByteBuffer.Length;
        if (allowMemoryBuffer)
        {
            // Buffer content in the MemoryStream if it has capacity.
            PagedByteBuffer.Add(buffer, offset, count);
            Debug.Assert(PagedByteBuffer.Length <= _memoryThreshold);
        }
        else
        {
            // If the MemoryStream is incapable of accommodating the content to be written
            // spool to disk.
            EnsureFileStream();

            // Spool memory content to disk.
            PagedByteBuffer.MoveTo(FileStream);

            FileStream.Write(buffer, offset, count);
        }
    }
}

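To keep things simple, the non-essential code has been omitted here.

The code above is quite simple and the comments are clear. The first important property is MemoryThreshold, which defaults to 32 KB; I suspect most people never configure it. The comment in the Write method states the buffering policy: buffering in memory is allowed only if we are still below the memory threshold once the current buffer is written. Now look at the size of the endpoint's response: 69 KB!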
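To make the policy concrete, here is a minimal sketch that mirrors only the threshold check from `Write`; it is not the framework's code. `SpillSketch` and `FirstSpillOffset` are hypothetical names, and the chunk size passed in is an assumption about how much the serializer writes per call.

```csharp
using System;

// Hypothetical helper, not part of ASP.NET Core. It mirrors only the
// threshold check in FileBufferingWriteStream.Write shown above.
public static class SpillSketch
{
    // Returns how many bytes were buffered in memory when the first write
    // would go to a temp file, or -1 if the whole payload fits in memory.
    public static long FirstSpillOffset(int memoryThreshold, long totalBytes, int chunkSize)
    {
        long buffered = 0; // stands in for PagedByteBuffer.Length
        while (buffered < totalBytes)
        {
            int count = (int)Math.Min(chunkSize, totalBytes - buffered);

            // Same condition as the original:
            // (_memoryThreshold - count) >= PagedByteBuffer.Length
            bool allowMemoryBuffer = (memoryThreshold - count) >= buffered;
            if (!allowMemoryBuffer)
            {
                return buffered;
            }

            buffered += count;
        }

        return -1;
    }
}
```

With a 30 KB threshold (the Newtonsoft.Json option default mentioned later in this post), a 69 KB payload, and an assumed 4 KB chunk size, `SpillSketch.FirstSpillOffset(30 * 1024, 69 * 1024, 4096)` returns 28672: the eighth chunk no longer fits, so the stream switches to a file. With a 1 MB threshold the same call returns -1.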

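The cause is already known at this point, but in the spirit of getting to the bottom of things, let's find out how this code gets called. Searching the official source for usages of this class immediately turns up what we need.

The current project happens to use the Newtonsoft.Json serialization library, and it is configured like this: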

mvcBuilder.AddNewtonsoftJson(p =>
            {
                // Camel-case property names (lowercase first letter); not used
                // p.SerializerSettings.ContractResolver = new CamelCasePropertyNamesContractResolver();
                // Do not use camel-case keys
                p.SerializerSettings.ContractResolver = new DefaultContractResolver();
                // Ignore reference loops
                p.SerializerSettings.ReferenceLoopHandling = ReferenceLoopHandling.Ignore;
                p.SerializerSettings.DateFormatString = "yyyy/MM/dd HH:mm:ss";
                p.SerializerSettings.NullValueHandling = NullValueHandling.Include;
            });

Let's see what this call does internally:

/// <summary>
/// Configures Newtonsoft.Json specific features such as input and output formatters.
/// </summary>
/// <param name="builder">The <see cref="IMvcBuilder"/>.</param>
/// <param name="setupAction">Callback to configure <see cref="MvcNewtonsoftJsonOptions"/>.</param>
/// <returns>The <see cref="IMvcBuilder"/>.</returns>
public static IMvcBuilder AddNewtonsoftJson(
    this IMvcBuilder builder,
    Action<MvcNewtonsoftJsonOptions> setupAction)
{
    if (builder == null)
    {
        throw new ArgumentNullException(nameof(builder));
    }

    if (setupAction == null)
    {
        throw new ArgumentNullException(nameof(setupAction));
    }
    // Key step: continues in the next function
    NewtonsoftJsonMvcCoreBuilderExtensions.AddServicesCore(builder.Services);
    builder.Services.Configure(setupAction);

    return builder;
}

internal static void AddServicesCore(IServiceCollection services)
{
    services.TryAddSingleton<ObjectPoolProvider, DefaultObjectPoolProvider>();
    // Important: the NewtonsoftJsonMvcOptionsSetup class is shown further below
    services.TryAddEnumerable(
        ServiceDescriptor.Transient<IConfigureOptions<MvcOptions>, NewtonsoftJsonMvcOptionsSetup>());
    services.TryAddEnumerable(
        ServiceDescriptor.Transient<IApiDescriptionProvider, JsonPatchOperationsArrayProvider>());


    var jsonResultExecutor = services.FirstOrDefault(f =>
                                                     f.ServiceType == typeof(IActionResultExecutor<JsonResult>) &&
                                                     f.ImplementationType?.Assembly == typeof(JsonResult).Assembly);

    if (jsonResultExecutor != null)
    {
        services.Remove(jsonResultExecutor);
    }
    // Important!
    services.TryAddSingleton<IActionResultExecutor<JsonResult>, NewtonsoftJsonResultExecutor>();

}

// Core method of the NewtonsoftJsonMvcOptionsSetup class
public void Configure(MvcOptions options)
{
    options.OutputFormatters.RemoveType<SystemTextJsonOutputFormatter>();
    options.OutputFormatters.Add(new NewtonsoftJsonOutputFormatter(_jsonOptions.SerializerSettings, _charPool, options, _jsonOptions));

    options.InputFormatters.RemoveType<SystemTextJsonInputFormatter>();
    // Register JsonPatchInputFormatter before JsonInputFormatter, otherwise
    // JsonInputFormatter would consume "application/json-patch+json" requests
    // before JsonPatchInputFormatter gets to see them.
    var jsonInputPatchLogger = _loggerFactory.CreateLogger<NewtonsoftJsonPatchInputFormatter>();
    options.InputFormatters.Add(new NewtonsoftJsonPatchInputFormatter(
        jsonInputPatchLogger,
        _jsonOptions.SerializerSettings,
        _charPool,
        _objectPoolProvider,
        options,
        _jsonOptions));

    var jsonInputLogger = _loggerFactory.CreateLogger<NewtonsoftJsonInputFormatter>();
    options.InputFormatters.Add(new NewtonsoftJsonInputFormatter(
        jsonInputLogger,
        _jsonOptions.SerializerSettings,
        _charPool,
        _objectPoolProvider,
        options,
        _jsonOptions));

    options.FormatterMappings.SetMediaTypeMappingForFormat("json", MediaTypeHeaderValues.ApplicationJson);

    options.ModelMetadataDetailsProviders.Add(new SuppressChildValidationMetadataProvider(typeof(IJsonPatchDocument)));
    options.ModelMetadataDetailsProviders.Add(new SuppressChildValidationMetadataProvider(typeof(JToken)));
}

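The core of all this code is replacing the built-in System.Text.Json handling with Newtonsoft.Json. So let's go into the NewtonsoftJsonOutputFormatter class and see how it writes the HTTP response body to a buffer file: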

/// <summary>
/// A <see cref="TextOutputFormatter"/> for JSON content.
/// </summary>
public class NewtonsoftJsonOutputFormatter : TextOutputFormatter
{
    // Method that writes the response body
    public override async Task WriteResponseBodyAsync(OutputFormatterWriteContext context, Encoding selectedEncoding)
    {

        // Compat mode for derived options
        _jsonOptions ??= context.HttpContext.RequestServices.GetRequiredService<IOptions<MvcNewtonsoftJsonOptions>>().Value;

        var response = context.HttpContext.Response;

        var responseStream = response.Body;
        FileBufferingWriteStream? fileBufferingWriteStream = null;


        // Core part
        if (!_mvcOptions.SuppressOutputFormatterBuffering)
        {
            fileBufferingWriteStream = new FileBufferingWriteStream(_jsonOptions.OutputFormatterMemoryBufferThreshold);
            responseStream = fileBufferingWriteStream;
        }

        var value = context.Object;
        if (value is not null && _asyncEnumerableReaderFactory.TryGetReader(value.GetType(), out var reader))
        {
            var logger = context.HttpContext.RequestServices.GetRequiredService<ILogger<NewtonsoftJsonOutputFormatter>>();
            Log.BufferingAsyncEnumerable(logger, value);
            try
            {
                value = await reader(value, context.HttpContext.RequestAborted);
            }
            catch (OperationCanceledException) { }
            if (context.HttpContext.RequestAborted.IsCancellationRequested)
            {
                return;
            }
        }

        try
        {
            await using (var writer = context.WriterFactory(responseStream, selectedEncoding))
            {
                using var jsonWriter = CreateJsonWriter(writer);
                var jsonSerializer = CreateJsonSerializer(context);
                jsonSerializer.Serialize(jsonWriter, value);
            }

            if (fileBufferingWriteStream != null)
            {
                response.ContentLength = fileBufferingWriteStream.Length;
                await fileBufferingWriteStream.DrainBufferAsync(response.BodyWriter);
            }
        }
        finally
        {
            if (fileBufferingWriteStream != null)
            {
                await fileBufferingWriteStream.DisposeAsync();
            }
        }
    }
}

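The code above makes it clear: in the part marked as core, the response stream is replaced with a FileBufferingWriteStream, the serializer writes into it, and that class internally chooses between a memory buffer and a file stream based on the threshold. Setting a breakpoint and debugging confirms this call stack.

At this point everything is clear. If anything is still unclear, you can single-step through the code line by line.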
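The `SuppressOutputFormatterBuffering` check in `WriteResponseBodyAsync` above is a public `MvcOptions` switch. As a rough sketch (not something this project did), turning the buffering off could look like the following. It assumes a .NET 6+ `Program.cs` with top-level statements and Kestrel as the server. With buffering suppressed, Newtonsoft.Json writes to the response body synchronously, so I am assuming Kestrel's `AllowSynchronousIO` has to be enabled as well.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Server.Kestrel.Core;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services
    .AddControllers(options =>
    {
        // Skips the FileBufferingWriteStream branch in the output formatter.
        options.SuppressOutputFormatterBuffering = true;
    })
    .AddNewtonsoftJson();

// Assumption: needed because the serializer now writes synchronously
// to the response body, which Kestrel rejects by default.
builder.Services.Configure<KestrelServerOptions>(options =>
{
    options.AllowSynchronousIO = true;
});

var app = builder.Build();
app.MapControllers();
app.Run();
```

This trades the temp files for blocking writes on request threads, so treat it as something to measure under load rather than a drop-in fix.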

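Solution

After finding the cause, I searched Google and found an article describing the same problem: https://purple.telstra.com/blog/how-we-sped-up-an-aspnet-core-endpoint-from-20-seconds-down-to-4-seconds. It points out two solutions. The first is to write synchronously instead of asynchronously. This is basically not worth considering, because it blocks threads and lowers throughput even further. The second is to use the official System.Text.Json, which supports asynchronous writing and therefore needs no file buffering. I checked the source code and this is indeed the case: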

public sealed override async Task WriteResponseBodyAsync(OutputFormatterWriteContext context, Encoding selectedEncoding)
{
    if (context == null)
    {
        throw new ArgumentNullException(nameof(context));
    }

    if (selectedEncoding == null)
    {
        throw new ArgumentNullException(nameof(selectedEncoding));
    }

    var httpContext = context.HttpContext;

    // context.ObjectType reflects the declared model type when specified.
    // For polymorphic scenarios where the user declares a return type, but returns a derived type,
    // we want to serialize all the properties on the derived type. This keeps parity with
    // the behavior you get when the user does not declare the return type and with Json.Net at least at the top level.
    var objectType = context.Object?.GetType() ?? context.ObjectType ?? typeof(object);

    var responseStream = httpContext.Response.Body;
    if (selectedEncoding.CodePage == Encoding.UTF8.CodePage)
    {
        try
        {
            await JsonSerializer.SerializeAsync(responseStream, context.Object, objectType, SerializerOptions, httpContext.RequestAborted);
            await responseStream.FlushAsync(httpContext.RequestAborted);
        }
        catch (OperationCanceledException) when (context.HttpContext.RequestAborted.IsCancellationRequested) { }
    }
    else
    {
        // JsonSerializer only emits UTF8 encoded output, but we need to write the response in the encoding specified by
        // selectedEncoding
        var transcodingStream = Encoding.CreateTranscodingStream(httpContext.Response.Body, selectedEncoding, Encoding.UTF8, leaveOpen: true);

        ExceptionDispatchInfo? exceptionDispatchInfo = null;
        try
        {
            await JsonSerializer.SerializeAsync(transcodingStream, context.Object, objectType, SerializerOptions);
            await transcodingStream.FlushAsync();
        }
        catch (Exception ex)
        {
            // TranscodingStream may write to the inner stream as part of it's disposal.
            // We do not want this exception "ex" to be eclipsed by any exception encountered during the write. We will stash it and
            // explicitly rethrow it during the finally block.
            exceptionDispatchInfo = ExceptionDispatchInfo.Capture(ex);
        }
        finally
        {
            try
            {
                await transcodingStream.DisposeAsync();
            }
            catch when (exceptionDispatchInfo != null)
            {
            }

            exceptionDispatchInfo?.Throw();
        }
    }
}
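If you do switch, the Newtonsoft.Json settings shown earlier need System.Text.Json counterparts. The sketch below is one possible mapping under stated assumptions, not a verified drop-in replacement: `StjSettings` and `SlashDateTimeConverter` are hypothetical names, `ReferenceHandler.IgnoreCycles` requires .NET 6 or later and writes `null` for a repeated reference instead of skipping the property, and the date format needs a custom converter because System.Text.Json has no `DateFormatString` setting.

```csharp
using System;
using System.Globalization;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical converter reproducing DateFormatString = "yyyy/MM/dd HH:mm:ss".
public sealed class SlashDateTimeConverter : JsonConverter<DateTime>
{
    private const string Format = "yyyy/MM/dd HH:mm:ss";

    public override DateTime Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
        => DateTime.ParseExact(reader.GetString()!, Format, CultureInfo.InvariantCulture);

    public override void Write(Utf8JsonWriter writer, DateTime value, JsonSerializerOptions options)
        => writer.WriteStringValue(value.ToString(Format, CultureInfo.InvariantCulture));
}

// Hypothetical helper that applies the approximate equivalents.
public static class StjSettings
{
    public static JsonSerializerOptions Apply(JsonSerializerOptions o)
    {
        // Keep property names as declared, like DefaultContractResolver.
        o.PropertyNamingPolicy = null;
        // Closest built-in counterpart to ReferenceLoopHandling.Ignore.
        o.ReferenceHandler = ReferenceHandler.IgnoreCycles;
        // Always emit null values, like NullValueHandling.Include.
        o.DefaultIgnoreCondition = JsonIgnoreCondition.Never;
        o.Converters.Add(new SlashDateTimeConverter());
        return o;
    }
}
```

In an MVC application this would be wired up with something like `mvcBuilder.AddJsonOptions(o => StjSettings.Apply(o.JsonSerializerOptions));` in place of the `AddNewtonsoftJson` call. The migration guide in the references lists further behavioral differences worth checking before switching.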

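I think this is the most recommended approach. If you would rather not switch because of compatibility differences between the two libraries, you can also raise the in-memory buffering threshold. The option defaults to 30 KB; you can raise it to 1 MB or whatever value you need. This approach uses more memory, though, so you have to weigh the trade-off yourself.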

mvcBuilder.AddNewtonsoftJson(p =>
            {
                // Set these to whatever values suit you
                p.InputFormatterMemoryBufferThreshold=1024*1024;
                p.OutputFormatterMemoryBufferThreshold = 1024 * 1024;

            });

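Note that both HTTP request bodies and response bodies are buffered to disk once they exceed the threshold!

I went with my own third option, raising the threshold. Running the JMeter load test again gave the expected result.

Done!

References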

  1. https://learn.microsoft.com/en-us/dotnet/core/compatibility/aspnetcore#http-synchronous-io-disabled-in-all-servers
  2. https://purple.telstra.com/blog/how-we-sped-up-an-aspnet-core-endpoint-from-20-seconds-down-to-4-seconds
  3. Differences between System.Text.Json and Newtonsoft.Json: https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/migrate-from-newtonsoft?pivots=dotnet-7-0
posted @ 2023-01-19 10:37  李正浩