ASP.NET Core: Tracing the Cause of High Disk I/O Through the Source Code
The problem
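While load testing a Web API project with JMeter over the last couple of days, I noticed that disk I/O was very high, as shown in the screenshot below:
The endpoint under test is very simple: it reads data from the database, caches it in memory and returns it, so it should not generate this much disk I/O. I started investigating and found that the cause was HTTP request and response bodies being buffered to disk. The analysis is recorded below.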
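Finding the files being read and written
To find the cause, we first need to see which files the Web API process is actually reading and writing. A tool I recommend for this is Process Monitor. After downloading it, click the Filter icon in the toolbar and filter by your process name (the process ID or another field works too); you will then see data like the following.
Clearly the program is reading and writing ASPNETCORE_****.tmp files in large volume. Opening one of them in VS Code shows that it contains the JSON string produced from the data the endpoint queried: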
{
....
// Private data, not shown here
}
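As a lightweight cross-check alongside Process Monitor, the temp directory can also be polled from a few lines of C#. This is only a sketch: `TempFileProbe` is a hypothetical helper, and it assumes the buffer files land in the directory named by the `ASPNETCORE_TEMP` environment variable when that is set, or in the OS temp directory otherwise. The files are short-lived, so a snapshot only catches those that exist at that instant.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper for a quick cross-check; not part of ASP.NET Core.
public static class TempFileProbe
{
    // Assumption: buffer files go to ASPNETCORE_TEMP when it is set,
    // otherwise to the OS temp directory.
    public static string DefaultTempDirectory()
    {
        var configured = Environment.GetEnvironmentVariable("ASPNETCORE_TEMP");
        return string.IsNullOrEmpty(configured) ? Path.GetTempPath() : configured;
    }

    // Returns (file name, size in bytes) for every ASPNETCORE_*.tmp file
    // that exists in the directory at the moment of the call.
    public static (string Name, long Bytes)[] Snapshot(string directory)
    {
        return new DirectoryInfo(directory)
            .EnumerateFiles("ASPNETCORE_*.tmp")
            .Select(f => (Name: f.Name, Bytes: f.Length))
            .ToArray();
    }
}
```

Calling `TempFileProbe.Snapshot(TempFileProbe.DefaultTempDirectory())` in a loop during the load test should show files appearing and disappearing if responses are being spooled to disk.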
Next, let's track down the cause in the code.
Code analysis
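For this step, first clone the official source from the aspnetcore repository. Then open the source folder in VS Code, search for .tmp and restrict the search to .cs files:
The FileBufferingWriteStream and FileBufferingReadStream classes in the results are clearly what we are looking for. Let's look at the code of FileBufferingWriteStream first: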
/// <summary>
/// A <see cref="Stream"/> that buffers content to be written to disk. Use <see cref="DrainBufferAsync(Stream, CancellationToken)" />
/// to write buffered content to a target <see cref="Stream" />.
/// </summary>
public sealed class FileBufferingWriteStream : Stream
{
    private const int DefaultMemoryThreshold = 32 * 1024; // 32k
    private readonly int _memoryThreshold;
    private readonly long? _bufferLimit;
    private readonly Func<string> _tempFileDirectoryAccessor;
    /// <summary>
    /// The maximum amount of memory in bytes to allocate before switching to a file on disk.
    /// </summary>
    /// <remarks>
    /// Defaults to 32kb.
    /// </remarks>
    public int MemoryThreshold => _memoryThreshold;
    /// <inheritdoc />
    public override void Write(byte[] buffer, int offset, int count)
    {
        ThrowArgumentException(buffer, offset, count);
        ThrowIfDisposed();
        if (_bufferLimit.HasValue && _bufferLimit - Length < count)
        {
            Dispose();
            throw new IOException("Buffer limit exceeded.");
        }
        // Allow buffering in memory if we're below the memory threshold once the current buffer is written.
        var allowMemoryBuffer = (_memoryThreshold - count) >= PagedByteBuffer.Length;
        if (allowMemoryBuffer)
        {
            // Buffer content in the MemoryStream if it has capacity.
            PagedByteBuffer.Add(buffer, offset, count);
            Debug.Assert(PagedByteBuffer.Length <= _memoryThreshold);
        }
        else
        {
            // If the MemoryStream is incapable of accommodating the content to be written
            // spool to disk.
            EnsureFileStream();
            // Spool memory content to disk.
            PagedByteBuffer.MoveTo(FileStream);
            FileStream.Write(buffer, offset, count);
        }
    }
}
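To keep things simple, everything except the core code has been omitted here.
The code above is straightforward and the comments are clear. The first important property is MemoryThreshold, which defaults to 32 KB; I suspect most people never configure it. The comment in the Write method then spells out the buffering strategy: buffering in memory is allowed only if the buffer is still below the memory threshold once the current chunk has been written; otherwise the content is spooled to disk. Now look at the size of the endpoint's response: 69 KB!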
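To make the threshold arithmetic concrete, here is a minimal sketch that replays only the `allowMemoryBuffer` condition from `Write` on plain integers. `BufferingSketch` and `SpillsToDisk` are hypothetical names, the 4 KB chunk size is an assumption for illustration, and no real stream or file is involved.

```csharp
using System;

// Hypothetical helper, not part of ASP.NET Core: it only replays the
// threshold check from FileBufferingWriteStream.Write on plain integers.
public static class BufferingSketch
{
    // Returns true if writing totalBytes in chunks of chunkSize would make
    // the stream leave the in-memory buffer and switch to a temp file.
    public static bool SpillsToDisk(int memoryThreshold, int totalBytes, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        long bufferedInMemory = 0;
        int remaining = totalBytes;
        while (remaining > 0)
        {
            int count = Math.Min(chunkSize, remaining);
            // Same condition as in Write(): stay in memory only if the buffer
            // is still within the threshold after this chunk is added.
            bool allowMemoryBuffer = (memoryThreshold - count) >= bufferedInMemory;
            if (!allowMemoryBuffer)
            {
                return true;
            }
            bufferedInMemory += count;
            remaining -= count;
        }
        return false;
    }
}
```

With these assumptions, `BufferingSketch.SpillsToDisk(32 * 1024, 69 * 1024, 4096)` returns true, while the same 69 KB response with a 1 MB threshold, `BufferingSketch.SpillsToDisk(1024 * 1024, 69 * 1024, 4096)`, returns false.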
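At this point the cause is already known, but in the spirit of getting to the bottom of things, let's keep going and find out how this code gets called. Searching the official source for usages of this class turned up what I was looking for right away:
The current project happens to use the NewtonsoftJson serialization library. This is how it is configured in my project: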
mvcBuilder.AddNewtonsoftJson(p =>
{
    // camelCase property names (lowercase first letter); not used
    // p.SerializerSettings.ContractResolver = new CamelCasePropertyNamesContractResolver();
    // Keep keys as declared instead of converting them to camelCase
    p.SerializerSettings.ContractResolver = new DefaultContractResolver();
    // Ignore reference loops
    p.SerializerSettings.ReferenceLoopHandling = ReferenceLoopHandling.Ignore;
    p.SerializerSettings.DateFormatString = "yyyy/MM/dd HH:mm:ss";
    p.SerializerSettings.NullValueHandling = NullValueHandling.Include;
});
Let's see what this call does internally:
/// <summary>
/// Configures Newtonsoft.Json specific features such as input and output formatters.
/// </summary>
/// <param name="builder">The <see cref="IMvcBuilder"/>.</param>
/// <param name="setupAction">Callback to configure <see cref="MvcNewtonsoftJsonOptions"/>.</param>
/// <returns>The <see cref="IMvcBuilder"/>.</returns>
public static IMvcBuilder AddNewtonsoftJson(
    this IMvcBuilder builder,
    Action<MvcNewtonsoftJsonOptions> setupAction)
{
    if (builder == null)
    {
        throw new ArgumentNullException(nameof(builder));
    }
    if (setupAction == null)
    {
        throw new ArgumentNullException(nameof(setupAction));
    }
    // Key step: continue into the next method
    NewtonsoftJsonMvcCoreBuilderExtensions.AddServicesCore(builder.Services);
    builder.Services.Configure(setupAction);
    return builder;
}
internal static void AddServicesCore(IServiceCollection services)
{
    services.TryAddSingleton<ObjectPoolProvider, DefaultObjectPoolProvider>();
    // Important! The NewtonsoftJsonMvcOptionsSetup class is shown further below
    services.TryAddEnumerable(
        ServiceDescriptor.Transient<IConfigureOptions<MvcOptions>, NewtonsoftJsonMvcOptionsSetup>());
    services.TryAddEnumerable(
        ServiceDescriptor.Transient<IApiDescriptionProvider, JsonPatchOperationsArrayProvider>());
    var jsonResultExecutor = services.FirstOrDefault(f =>
        f.ServiceType == typeof(IActionResultExecutor<JsonResult>) &&
        f.ImplementationType?.Assembly == typeof(JsonResult).Assembly);
    if (jsonResultExecutor != null)
    {
        services.Remove(jsonResultExecutor);
    }
    // Important!
    services.TryAddSingleton<IActionResultExecutor<JsonResult>, NewtonsoftJsonResultExecutor>();
}
// Core method of the NewtonsoftJsonMvcOptionsSetup class
public void Configure(MvcOptions options)
{
    options.OutputFormatters.RemoveType<SystemTextJsonOutputFormatter>();
    options.OutputFormatters.Add(new NewtonsoftJsonOutputFormatter(_jsonOptions.SerializerSettings, _charPool, options, _jsonOptions));
    options.InputFormatters.RemoveType<SystemTextJsonInputFormatter>();
    // Register JsonPatchInputFormatter before JsonInputFormatter, otherwise
    // JsonInputFormatter would consume "application/json-patch+json" requests
    // before JsonPatchInputFormatter gets to see them.
    var jsonInputPatchLogger = _loggerFactory.CreateLogger<NewtonsoftJsonPatchInputFormatter>();
    options.InputFormatters.Add(new NewtonsoftJsonPatchInputFormatter(
        jsonInputPatchLogger,
        _jsonOptions.SerializerSettings,
        _charPool,
        _objectPoolProvider,
        options,
        _jsonOptions));
    var jsonInputLogger = _loggerFactory.CreateLogger<NewtonsoftJsonInputFormatter>();
    options.InputFormatters.Add(new NewtonsoftJsonInputFormatter(
        jsonInputLogger,
        _jsonOptions.SerializerSettings,
        _charPool,
        _objectPoolProvider,
        options,
        _jsonOptions));
    options.FormatterMappings.SetMediaTypeMappingForFormat("json", MediaTypeHeaderValues.ApplicationJson);
    options.ModelMetadataDetailsProviders.Add(new SuppressChildValidationMetadataProvider(typeof(IJsonPatchDocument)));
    options.ModelMetadataDetailsProviders.Add(new SuppressChildValidationMetadataProvider(typeof(JToken)));
}
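The core logic of all the code above is to replace the built-in SystemTextJson formatters with the NewtonsoftJson ones. So let's go into the NewtonsoftJsonOutputFormatter class and see how it ends up writing the HTTP response body to a buffer file: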
/// <summary>
/// A <see cref="TextOutputFormatter"/> for JSON content.
/// </summary>
public class NewtonsoftJsonOutputFormatter : TextOutputFormatter
{
    // Method that writes the response body
    public override async Task WriteResponseBodyAsync(OutputFormatterWriteContext context, Encoding selectedEncoding)
    {
        // Compat mode for derived options
        _jsonOptions ??= context.HttpContext.RequestServices.GetRequiredService<IOptions<MvcNewtonsoftJsonOptions>>().Value;
        var response = context.HttpContext.Response;
        var responseStream = response.Body;
        FileBufferingWriteStream? fileBufferingWriteStream = null;
        // Key part
        if (!_mvcOptions.SuppressOutputFormatterBuffering)
        {
            fileBufferingWriteStream = new FileBufferingWriteStream(_jsonOptions.OutputFormatterMemoryBufferThreshold);
            responseStream = fileBufferingWriteStream;
        }
        var value = context.Object;
        if (value is not null && _asyncEnumerableReaderFactory.TryGetReader(value.GetType(), out var reader))
        {
            var logger = context.HttpContext.RequestServices.GetRequiredService<ILogger<NewtonsoftJsonOutputFormatter>>();
            Log.BufferingAsyncEnumerable(logger, value);
            try
            {
                value = await reader(value, context.HttpContext.RequestAborted);
            }
            catch (OperationCanceledException) { }
            if (context.HttpContext.RequestAborted.IsCancellationRequested)
            {
                return;
            }
        }
        try
        {
            await using (var writer = context.WriterFactory(responseStream, selectedEncoding))
            {
                using var jsonWriter = CreateJsonWriter(writer);
                var jsonSerializer = CreateJsonSerializer(context);
                jsonSerializer.Serialize(jsonWriter, value);
            }
            if (fileBufferingWriteStream != null)
            {
                response.ContentLength = fileBufferingWriteStream.Length;
                await fileBufferingWriteStream.DrainBufferAsync(response.BodyWriter);
            }
        }
        finally
        {
            if (fileBufferingWriteStream != null)
            {
                await fileBufferingWriteStream.DisposeAsync();
            }
        }
    }
}
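The code above makes it obvious: at the spot marked "Key part", the response stream is swapped for a fileBufferingWriteStream, so the serializer writes into it, and that class then uses its threshold parameter to decide whether to buffer in a memory stream or a file stream. Let's set a breakpoint and look at the call stack in the debugger:
At this point everything is clear. If anything is still unclear, you can single-step through the code line by line.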
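The branch marked as the key part is guarded by `MvcOptions.SuppressOutputFormatterBuffering`, so the buffering can be switched off from application configuration. The fragment below is a sketch of what that might look like, assuming the .NET 6+ minimal hosting model and Kestrel as the server. With buffering suppressed, Newtonsoft.Json serializes synchronously straight into the response body, so Kestrel's `AllowSynchronousIO` would also have to be enabled, which trades disk I/O for blocked threads.

```csharp
// Program.cs fragment (assumes the .NET 6+ minimal hosting model and Kestrel)
var builder = WebApplication.CreateBuilder(args);

builder.Services
    .AddControllers(options =>
    {
        // Skips the FileBufferingWriteStream branch in the output formatter.
        options.SuppressOutputFormatterBuffering = true;
    })
    .AddNewtonsoftJson();

// Newtonsoft.Json then writes synchronously to the response body, which
// Kestrel rejects by default unless synchronous I/O is explicitly allowed.
builder.WebHost.ConfigureKestrel(kestrel => kestrel.AllowSynchronousIO = true);

var app = builder.Build();
app.MapControllers();
app.Run();
```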
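Solutions
Once I knew the cause, a Google search turned up an article describing the same problem: https://purple.telstra.com/blog/how-we-sped-up-an-aspnet-core-endpoint-from-20-seconds-down-to-4-seconds
The article points out two solutions. The first is to write synchronously instead of asynchronously. This one is hardly worth considering, because it blocks threads and reduces throughput even further. The second is to use the official SystemTextJson: since it supports asynchronous writes, no file buffering is needed. I checked the source code and this is indeed the case: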
public sealed override async Task WriteResponseBodyAsync(OutputFormatterWriteContext context, Encoding selectedEncoding)
{
    if (context == null)
    {
        throw new ArgumentNullException(nameof(context));
    }
    if (selectedEncoding == null)
    {
        throw new ArgumentNullException(nameof(selectedEncoding));
    }
    var httpContext = context.HttpContext;
    // context.ObjectType reflects the declared model type when specified.
    // For polymorphic scenarios where the user declares a return type, but returns a derived type,
    // we want to serialize all the properties on the derived type. This keeps parity with
    // the behavior you get when the user does not declare the return type and with Json.Net at least at the top level.
    var objectType = context.Object?.GetType() ?? context.ObjectType ?? typeof(object);
    var responseStream = httpContext.Response.Body;
    if (selectedEncoding.CodePage == Encoding.UTF8.CodePage)
    {
        try
        {
            await JsonSerializer.SerializeAsync(responseStream, context.Object, objectType, SerializerOptions, httpContext.RequestAborted);
            await responseStream.FlushAsync(httpContext.RequestAborted);
        }
        catch (OperationCanceledException) when (context.HttpContext.RequestAborted.IsCancellationRequested) { }
    }
    else
    {
        // JsonSerializer only emits UTF8 encoded output, but we need to write the response in the encoding specified by
        // selectedEncoding
        var transcodingStream = Encoding.CreateTranscodingStream(httpContext.Response.Body, selectedEncoding, Encoding.UTF8, leaveOpen: true);
        ExceptionDispatchInfo? exceptionDispatchInfo = null;
        try
        {
            await JsonSerializer.SerializeAsync(transcodingStream, context.Object, objectType, SerializerOptions);
            await transcodingStream.FlushAsync();
        }
        catch (Exception ex)
        {
            // TranscodingStream may write to the inner stream as part of it's disposal.
            // We do not want this exception "ex" to be eclipsed by any exception encountered during the write. We will stash it and
            // explicitly rethrow it during the finally block.
            exceptionDispatchInfo = ExceptionDispatchInfo.Capture(ex);
        }
        finally
        {
            try
            {
                await transcodingStream.DisposeAsync();
            }
            catch when (exceptionDispatchInfo != null)
            {
            }
            exceptionDispatchInfo?.Throw();
        }
    }
}
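I think this is the most recommended approach. If you would rather not switch because of compatibility differences between the two libraries, another option is to raise the in-memory buffer threshold. The MvcNewtonsoftJsonOptions thresholds default to 30 KB and can be raised to 1 MB or whatever value suits you. This approach needs more memory, though, so you have to weigh the trade-off yourself.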
mvcBuilder.AddNewtonsoftJson(p =>
{
    // Set these to whatever values suit your workload
    p.InputFormatterMemoryBufferThreshold = 1024 * 1024;
    p.OutputFormatterMemoryBufferThreshold = 1024 * 1024;
});
Note that both HTTP request bodies and response bodies are buffered to disk once they exceed the threshold!
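For readers who prefer the System.Text.Json route, the sketch below shows roughly how the Newtonsoft settings used in this project might be approximated. It replaces the `AddNewtonsoftJson` call rather than sitting next to it, `mvcBuilder` is the same `IMvcBuilder` as in the snippets above, and `ReferenceHandler.IgnoreCycles` assumes .NET 6 or later. The mapping is approximate, not an exact equivalent: for example, cycles are written as null rather than omitted.

```csharp
using System.Text.Json.Serialization;

// Replaces mvcBuilder.AddNewtonsoftJson(...); do not register both.
mvcBuilder.AddJsonOptions(o =>
{
    // Keep property names as declared (rough counterpart of DefaultContractResolver).
    o.JsonSerializerOptions.PropertyNamingPolicy = null;
    // Write null instead of throwing when a reference cycle is found (.NET 6+).
    o.JsonSerializerOptions.ReferenceHandler = ReferenceHandler.IgnoreCycles;
    // Null values are written by default; stated explicitly to mirror NullValueHandling.Include.
    o.JsonSerializerOptions.DefaultIgnoreCondition = JsonIgnoreCondition.Never;
});
```

There is no direct counterpart of `DateFormatString` among these options; a custom `JsonConverter<DateTime>` would be needed to keep the "yyyy/MM/dd HH:mm:ss" format.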
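Here I went with my own third option, raising the thresholds. After rerunning the JMeter load test, the results were as expected.
Done!
References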
- https://learn.microsoft.com/en-us/dotnet/core/compatibility/aspnetcore#http-synchronous-io-disabled-in-all-servers
- https://purple.telstra.com/blog/how-we-sped-up-an-aspnet-core-endpoint-from-20-seconds-down-to-4-seconds
- Differences between the official JSON library and Newtonsoft.Json: https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/migrate-from-newtonsoft?pivots=dotnet-7-0