Dremio Arrow Flight protocol server implementation: a brief look at the DremioFlightProducer code
DremioFlightProducer contains the core of Dremio's Arrow Flight implementation.
FlightProducer interface definition
Implementing a producer mainly means implementing FlightProducer, which contains the following methods
What the methods mean
How Dremio implements FlightProducer
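Because Dremio is primarily a query engine (although it also supports CREATE TABLE on certain storage such as NAS, HDFS, and S3), some Flight protocol methods currently just report that they are unimplemented.
The unimplemented methods are shown below. Note that this does not hold for every producer: other Flight producers, such as CoordinatorFlightProducer, have to implement more of them.
DremioFlightProducer is comparatively simple because its location is just the machine being accessed. CoordinatorFlightProducer has to implement considerably more; refer to its implementation for details.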
@Override
public Runnable acceptPut(CallContext callContext, FlightStream flightStream, StreamListener<PutResult> streamListener) {
  throw CallStatus.UNIMPLEMENTED.withDescription("acceptPut is unimplemented").toRuntimeException();
}

@Override
public void doAction(CallContext callContext, Action action, StreamListener<Result> streamListener) {
  throw CallStatus.UNIMPLEMENTED.withDescription("doAction is unimplemented").toRuntimeException();
}

@Override
public void listActions(CallContext callContext, StreamListener<ActionType> streamListener) {
  throw CallStatus.UNIMPLEMENTED.withDescription("listActions is unimplemented").toRuntimeException();
}

@Override
public void listFlights(CallContext callContext, Criteria criteria, StreamListener<FlightInfo> streamListener) {
  throw CallStatus.UNIMPLEMENTED.withDescription("listFlights is unimplemented").toRuntimeException();
}
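That leaves four methods for Dremio to handle, but doExchange (not supported for now) and getSchema are declared as default methods in the interface, so only two really have to be implemented: getStream and getFlightInfo. Judging by the API definition, these mainly deal with data stream information.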
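The effect of the default methods can be sketched with a small stand-in interface. This is only an illustration: MiniProducer and QueryOnlyProducer are hypothetical names, the String-based signatures are simplifications rather than Arrow's real types, and the bodies of the two default methods are an assumption about how such defaults typically behave.

```java
import java.util.List;

// Simplified, hypothetical stand-in for FlightProducer (not Arrow's API).
interface MiniProducer {
  // Abstract: every implementation must provide these two.
  String getFlightInfo(String descriptor);

  List<String> getStream(String ticket);

  // Abstract as well; the implementation below rejects it explicitly.
  void doAction(String action);

  // Default (assumed behavior): derived from getFlightInfo, so implementers can skip it.
  default String getSchema(String descriptor) {
    return "schema-of(" + getFlightInfo(descriptor) + ")";
  }

  // Default (assumed behavior): reports "unimplemented" unless overridden.
  default void doExchange(String descriptor) {
    throw new UnsupportedOperationException("doExchange is unimplemented");
  }
}

// A query-only producer: two real methods, one explicit rejection, two inherited defaults.
class QueryOnlyProducer implements MiniProducer {
  @Override
  public String getFlightInfo(String descriptor) {
    return "info:" + descriptor;
  }

  @Override
  public List<String> getStream(String ticket) {
    return List.of("batch-1 for " + ticket, "batch-2 for " + ticket);
  }

  @Override
  public void doAction(String action) {
    throw new UnsupportedOperationException("doAction is unimplemented");
  }
}
```

QueryOnlyProducer never mentions getSchema or doExchange, yet both are callable on it, which is why the list of methods that need real code shrinks to two.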
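Both methods depend on session handling. The core of the session is the account information, which supports authorization and gives access to other configuration such as engine settings and routing; see
sabot/kernel/src/main/java/com/dremio/sabot/rpc/user/UserSession.java. Both methods also rely on FlightWorkManager, Dremio's own wrapper.
getStream and getFlightInfo code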
@Override
public void getStream(CallContext callContext, Ticket ticket, ServerStreamListener serverStreamListener) {
  try {
    final CallHeaders headers = retrieveHeadersFromCallContext(callContext);
    final UserSession session = sessionsManager.getUserSession(callContext.peerIdentity(), headers);
    final TicketContent.PreparedStatementTicket preparedStatementTicket = TicketContent.PreparedStatementTicket.parseFrom(ticket.getBytes());
    // Wraps an actual execution on top of the user worker and receives the data through callbacks
    flightWorkManager.runPreparedStatement(preparedStatementTicket, serverStreamListener, allocator, session);
  } catch (InvalidProtocolBufferException ex) {
    final RuntimeException error = CallStatus.INVALID_ARGUMENT.withCause(ex).withDescription("Invalid ticket used in getStream").toRuntimeException();
    serverStreamListener.error(error);
    throw error;
  }
}

@Override
public FlightInfo getFlightInfo(CallContext callContext, FlightDescriptor flightDescriptor) {
  final CallHeaders headers = retrieveHeadersFromCallContext(callContext);
  final UserSession session = sessionsManager.getUserSession(callContext.peerIdentity(), headers);
  final FlightPreparedStatement flightPreparedStatement = flightWorkManager
    .createPreparedStatement(flightDescriptor, callContext::isCancelled, session);
  // Fetching the Flight info is also an RPC call to the user worker; it just completes quickly.
  // The wrapper blocks (a while loop that returns based on a timeout) so the required data is available.
  return flightPreparedStatement.getFlightInfo(location);
}
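The ticket handed out by getFlightInfo is what getStream later parses, and anything that does not decode is rejected as INVALID_ARGUMENT. The round trip can be sketched without protobuf. MiniTicket and its length-prefixed byte layout are hypothetical; the real PreparedStatementTicket is a protobuf message, so this only illustrates the encode, decode, and reject flow.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for PreparedStatementTicket: a query plus an opaque server handle.
final class MiniTicket {
  final String query;
  final byte[] handle;

  MiniTicket(String query, byte[] handle) {
    this.query = query;
    this.handle = handle;
  }

  // Serialize as [queryLen][query bytes][handleLen][handle bytes].
  byte[] toByteArray() {
    byte[] q = query.getBytes(StandardCharsets.UTF_8);
    ByteBuffer buf = ByteBuffer.allocate(4 + q.length + 4 + handle.length);
    buf.putInt(q.length).put(q).putInt(handle.length).put(handle);
    return buf.array();
  }

  // Parse the bytes back; malformed input is reported as an invalid argument,
  // in the same spirit as the INVALID_ARGUMENT status raised by getStream.
  static MiniTicket parseFrom(byte[] bytes) {
    try {
      ByteBuffer buf = ByteBuffer.wrap(bytes);
      byte[] q = readChunk(buf);
      byte[] h = readChunk(buf);
      if (buf.hasRemaining()) {
        throw new IllegalArgumentException("Invalid ticket: trailing bytes");
      }
      return new MiniTicket(new String(q, StandardCharsets.UTF_8), h);
    } catch (BufferUnderflowException ex) {
      throw new IllegalArgumentException("Invalid ticket: truncated", ex);
    }
  }

  // Reads one length-prefixed chunk, validating the length against the remaining bytes.
  private static byte[] readChunk(ByteBuffer buf) {
    int len = buf.getInt();
    if (len < 0 || len > buf.remaining()) {
      throw new IllegalArgumentException("Invalid ticket: bad length " + len);
    }
    byte[] out = new byte[len];
    buf.get(out);
    return out;
  }
}
```

Because the ticket carries both the query and the server handle, the server can pick up the prepared statement from the ticket alone when the client comes back with getStream.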
How FlightPreparedStatement blocks while fetching the data
public FlightInfo getFlightInfo(Location location) {
  // Blocks here until the create-prepared-statement response is available
  final UserProtos.CreatePreparedStatementArrowResp createPreparedStatementResp = responseHandler.get();
  final Schema schema = buildSchema(createPreparedStatementResp.getPreparedStatement().getArrowSchema());
  final PreparedStatementTicket preparedStatementTicketContent = PreparedStatementTicket.newBuilder()
    .setQuery(query)
    .setHandle(createPreparedStatementResp.getPreparedStatement().getServerHandle())
    .build();
  final Ticket ticket = new Ticket(preparedStatementTicketContent.toByteArray());
  final FlightEndpoint flightEndpoint = new FlightEndpoint(ticket, location);
  return new FlightInfo(schema, flightDescriptor, ImmutableList.of(flightEndpoint), -1, -1);
}
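The blocking responseHandler.get() call can be pictured as a poll loop with a timeout that also checks whether the client has cancelled. The sketch below is not Dremio's code: BlockingResponseHandler, its method names, and the pollMillis parameter are hypothetical, and the use of CompletableFuture is an assumption made to keep the example self-contained.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical handler: a worker thread completes it, the RPC thread blocks in get().
final class BlockingResponseHandler<T> {
  private final CompletableFuture<T> future = new CompletableFuture<>();
  private final Supplier<Boolean> isRequestCancelled;
  private final long pollMillis;

  BlockingResponseHandler(Supplier<Boolean> isRequestCancelled, long pollMillis) {
    this.isRequestCancelled = isRequestCancelled;
    this.pollMillis = pollMillis;
  }

  // Called by the worker when the response is ready.
  void completed(T value) {
    future.complete(value);
  }

  // Called by the worker when the job failed.
  void failed(Throwable cause) {
    future.completeExceptionally(cause);
  }

  // Waits in short slices so cancellation is noticed between polls.
  T get() {
    while (true) {
      if (isRequestCancelled.get()) {
        throw new IllegalStateException("request cancelled by client");
      }
      try {
        return future.get(pollMillis, TimeUnit.MILLISECONDS);
      } catch (TimeoutException ex) {
        // Not ready yet: loop again and re-check cancellation.
      } catch (ExecutionException ex) {
        throw new RuntimeException(ex.getCause());
      } catch (InterruptedException ex) {
        Thread.currentThread().interrupt();
        throw new RuntimeException(ex);
      }
    }
  }
}
```

Polling in short slices instead of waiting indefinitely is what lets a supplier such as callContext::isCancelled stop the wait when the client goes away.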
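FlightWorkManager depends on UserWorker and OptionManager. UserWorker mainly handles job submission (a later post will cover its implementation), and OptionManager
mainly holds the Flight-related configuration. FlightWorkManager has two main methods: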
public FlightPreparedStatement createPreparedStatement(FlightDescriptor flightDescriptor,
  Supplier<Boolean> isRequestCancelled, UserSession userSession);

public void runPreparedStatement(TicketContent.PreparedStatementTicket ticket, FlightProducer.ServerStreamListener listener,
  BufferAllocator allocator, UserSession userSession);
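runPreparedStatement executes the statement directly and handles the results through a callback mechanism: RunQueryResponseHandler implements UserResponseHandler,
so the result data of a submitted job can be processed through the UserResponseHandler callbacks.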
public void runPreparedStatement(TicketContent.PreparedStatementTicket ticket, FlightProducer.ServerStreamListener listener,
  BufferAllocator allocator, UserSession userSession) {
  final UserBitShared.ExternalId runExternalId = ExternalIdHelper.generateExternalId();
  final UserRequest userRequest =
    new UserRequest(UserProtos.RpcType.RUN_QUERY,
      UserProtos.RunQuery.newBuilder()
        .setType(UserBitShared.QueryType.PREPARED_STATEMENT)
        .setPriority(UserProtos.QueryPriority.newBuilder()
          .setWorkloadType(UserBitShared.WorkloadType.FLIGHT)
          .setWorkloadClass(UserBitShared.WorkloadClass.GENERAL))
        .setSource(UserProtos.SubmissionSource.FLIGHT)
        .setPreparedStatementHandle(ticket.getHandle())
        .build());
  // Wrap the listener in a responseHandler
  final UserResponseHandler responseHandler = runQueryResponseHandlerFactory.getHandler(runExternalId, userSession,
    workerProvider, optionManagerProvider, listener, allocator);
  workerProvider.get().submitWork(runExternalId, userSession, responseHandler, userRequest, TerminationListenerRegistry.NOOP);
}
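The callback flow, in which the worker pushes results to a handler and the handler forwards them to the Flight stream listener, can be sketched as follows. All types here (ResponseHandler, StreamListener, ForwardingHandler, MiniWorker, RecordingListener) are hypothetical simplifications that use strings in place of Arrow record batches; they are not the signatures of UserResponseHandler or ServerStreamListener.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical counterpart of UserResponseHandler: the worker calls back with data and a final status.
interface ResponseHandler {
  void sendData(String batch);

  void completed(Throwable errorOrNull);
}

// Hypothetical counterpart of ServerStreamListener: what the Flight client ultimately receives.
interface StreamListener {
  void putNext(String batch);

  void completed();

  void error(Throwable cause);
}

// Adapts worker callbacks to the stream listener.
final class ForwardingHandler implements ResponseHandler {
  private final StreamListener listener;

  ForwardingHandler(StreamListener listener) {
    this.listener = listener;
  }

  @Override
  public void sendData(String batch) {
    listener.putNext(batch);
  }

  @Override
  public void completed(Throwable errorOrNull) {
    if (errorOrNull == null) {
      listener.completed();
    } else {
      listener.error(errorOrNull);
    }
  }
}

// Stand-in worker: runs the "query" and reports only through the handler.
final class MiniWorker {
  void submitWork(List<String> batches, ResponseHandler handler) {
    for (String batch : batches) {
      handler.sendData(batch);
    }
    handler.completed(null);
  }
}

// Listener that records events so the flow can be inspected.
final class RecordingListener implements StreamListener {
  final List<String> events = new ArrayList<>();

  @Override
  public void putNext(String batch) {
    events.add("next:" + batch);
  }

  @Override
  public void completed() {
    events.add("completed");
  }

  @Override
  public void error(Throwable cause) {
    events.add("error:" + cause.getMessage());
  }
}
```

The caller of submitWork gets nothing back directly; everything the client sees arrives through the listener, which matches how runPreparedStatement returns void and leaves delivery to the handler.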
createPreparedStatement is in fact also a scheduled job execution; the result is simply wrapped as a FlightPreparedStatement.
public FlightPreparedStatement createPreparedStatement(FlightDescriptor flightDescriptor,
  Supplier<Boolean> isRequestCancelled, UserSession userSession) {
  final String query = getQuery(flightDescriptor);
  final UserProtos.CreatePreparedStatementArrowReq createPreparedStatementReq =
    UserProtos.CreatePreparedStatementArrowReq.newBuilder()
      .setSqlQuery(query)
      .build();
  final UserBitShared.ExternalId prepareExternalId = ExternalIdHelper.generateExternalId();
  final UserRequest userRequest =
    new UserRequest(UserProtos.RpcType.CREATE_PREPARED_STATEMENT_ARROW, createPreparedStatementReq);
  final CreatePreparedStatementResponseHandler createPreparedStatementResponseHandler =
    new CreatePreparedStatementResponseHandler(prepareExternalId, userSession, workerProvider, isRequestCancelled);
  workerProvider.get().submitWork(prepareExternalId, userSession, createPreparedStatementResponseHandler,
    userRequest, TerminationListenerRegistry.NOOP);
  return new FlightPreparedStatement(flightDescriptor, query, createPreparedStatementResponseHandler);
}
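Notes
DremioFlightProducer is a fairly important part of Dremio's Flight service implementation, yet it does not contain much code. Dremio has quite a few implementations of the Flight protocol.
The figure below shows the classes that currently implement Flight.
References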
https://arrow.apache.org/blog/2022/02/16/introducing-arrow-flight-sql/
https://github.com/apache/arrow/blob/master/java/flight/flight-sql/src/main/java/org/apache/arrow/flight/sql/example/FlightSqlClientDemoApp.java
https://arrow.apache.org/docs/java/reference/index.html
https://github.com/apache/arrow/blob/master/format/Flight.proto