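A brief note on how Dremio node statistics are displayed
You may have noticed that the node statistics in the Dremio admin UI show N/A. The web front end applies some extra display handling on top of the API data, which can be confusing.
UI appearance
Data returned by the API
Judging from the official source code, this display is expected: the values are percentages, and when system load is low they are essentially 0, which the front end renders as N/A.
Web-side handling
NodeActivityView.js
- Reference code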
[port]: {
  node: () => (node.get("port") !== -1 ? node.get("port") : "N/A"),
},
[cpu]: {
  node: () =>
    node.get("cpu") !== 0
      ? `${NumberFormatUtils.roundNumberField(node.get("cpu"))}%`
      : "N/A",
},
[memory]: {
  node: () =>
    node.get("memory") !== 0
      ? `${NumberFormatUtils.roundNumberField(node.get("memory"))}%`
      : "N/A", // todo: check comps for digits. and fix so no need for parseFloat
},
[version]: {
  node: () => node.get("version") || "-",
},
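The display rule above can be mirrored in a few lines of Java to make its effect easier to see. This is only a sketch: the class `NodeCellFormatter` and its methods are hypothetical names, and two-decimal formatting via `String.format` stands in for `NumberFormatUtils.roundNumberField`, whose exact rounding is not shown in this post.

```java
import java.util.Locale;

// Hypothetical helper mirroring the display rule in NodeActivityView.js.
final class NodeCellFormatter {
  private NodeCellFormatter() {}

  // A port of -1 is rendered as "N/A"; any other value is shown as-is.
  static String formatPort(int port) {
    return port != -1 ? Integer.toString(port) : "N/A";
  }

  // A value of exactly 0 is rendered as "N/A"; anything else as a percentage.
  // Two-decimal rounding is an assumption, not the behavior of roundNumberField.
  static String formatPercent(double value) {
    return value != 0 ? String.format(Locale.ROOT, "%.2f%%", value) : "N/A";
  }

  public static void main(String[] args) {
    System.out.println(formatPort(-1));         // N/A
    System.out.println(formatPort(45678));      // 45678
    System.out.println(formatPercent(0d));      // N/A
    System.out.println(formatPercent(12.3456)); // 12.35%
  }
}
```

Under this rule a node that is genuinely idle cannot be told apart from one for which no value was collected: both end up as N/A.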
Back-end handling code
- API endpoint (SystemResource)
@GET
@Path("/nodes")
@Produces(MediaType.APPLICATION_JSON)
public List<NodeInfo> getNodes(){
  final List<NodeInfo> result = new ArrayList<>();
  final Map<String, NodeEndpoint> execMap = new HashMap<>();
  final Map<String, NodeEndpoint> coordMap = new HashMap<>();
  // first get the coordinator nodes (in case there are no executors running)
  for(NodeEndpoint ep : context.get().getCoordinators()){
    coordMap.put(ep.getAddress() + ":" + ep.getFabricPort(), ep);
  }
  // try to get any executor nodes, but don't throw a UserException if we can't find any
  try {
    NodeStatsListener nodeStatsListener = new NodeStatsListener(context.get().getExecutors().size());
    context.get().getExecutors().forEach(
      ep -> {
        // issue the RPC call through the executorServiceClient
        executorServiceClientFactoryProvider.get().getClientForEndpoint(ep).getNodeStats(Empty.newBuilder().build(),
          nodeStatsListener);
      }
    );
    try {
      nodeStatsListener.waitForFinish();
    } catch (Exception ex) {
      logger.warn("Error while collecting node statistics: {}", ex.getMessage());
    }
    ConcurrentHashMap<String, NodeInstance> nodeStats = nodeStatsListener.getResult();
    for (NodeEndpoint ep : context.get().getExecutors()) {
      execMap.put(ep.getAddress() + ":" + ep.getFabricPort(), ep);
    }
    for (Map.Entry<String, NodeInstance> statsEntry : nodeStats.entrySet()) {
      NodeInstance stat = statsEntry.getValue();
      NodeEndpoint ep = execMap.remove(statsEntry.getKey());
      coordMap.remove(statsEntry.getKey());
      if (ep == null) {
        logger.warn("Unable to find node with identity: {}", statsEntry.getKey());
        continue;
      }
      result.add(NodeInfo.fromNodeInstance(stat));
    }
  } catch (UserException e) {
    logger.warn(e.getMessage());
  }
  final List<NodeInfo> finalList = new ArrayList<>();
  final List<NodeInfo> coord = new ArrayList<>();
  for (NodeEndpoint ep : coordMap.values()){
    // convert the endpoint into the response data
    final NodeInfo nodeInfo = NodeInfo.fromEndpoint(ep);
    if (nodeInfo.getIsMaster()) {
      finalList.add(nodeInfo);
    } else {
      coord.add(nodeInfo);
    }
  }
  final List<NodeInfo> failedNodes = new ArrayList<>();
  for (NodeEndpoint ep : execMap.values()){
    final NodeInfo nodeInfo = NodeInfo.fromUnresponsiveEndpoint(ep);
    failedNodes.add(nodeInfo);
  }
  // put coordinators first.
  finalList.addAll(coord);
  finalList.addAll(result);
  finalList.addAll(failedNodes);
  return finalList;
}
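The bookkeeping in `getNodes()` is easier to follow on a toy model. The sketch below is an assumption-laden simplification, not Dremio code: `NodeListSketch`, its `merge` method, and the string labels are hypothetical. Endpoints are reduced to an `address:fabricPort` key, and the master-first ordering among coordinators is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the merge performed in getNodes().
final class NodeListSketch {
  private NodeListSketch() {}

  // coordinators / executors: endpoint key ("address:fabricPort") -> label.
  // stats: key of each executor that answered the stats RPC -> its cpu value.
  static List<String> merge(Map<String, String> coordinators,
                            Map<String, String> executors,
                            Map<String, Double> stats) {
    Map<String, String> coordMap = new LinkedHashMap<>(coordinators);
    Map<String, String> execMap = new LinkedHashMap<>(executors);
    List<String> withStats = new ArrayList<>();

    for (Map.Entry<String, Double> entry : stats.entrySet()) {
      String label = execMap.remove(entry.getKey());
      // A node that reported stats is not listed again as a plain coordinator.
      coordMap.remove(entry.getKey());
      if (label == null) {
        continue; // stats from an endpoint that is not a known executor
      }
      withStats.add(label + " cpu=" + entry.getValue());
    }

    List<String> result = new ArrayList<>();
    // Remaining coordinators carry no measurements, only the 0 placeholder.
    for (String label : coordMap.values()) {
      result.add(label + " cpu=0.0");
    }
    result.addAll(withStats);
    // Executors still left in execMap never answered the RPC.
    for (String label : execMap.values()) {
      result.add(label + " unresponsive");
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> coords = new LinkedHashMap<>();
    coords.put("10.0.0.1:45678", "coordinator");
    Map<String, String> execs = new LinkedHashMap<>();
    execs.put("10.0.0.2:45678", "executor-a");
    execs.put("10.0.0.3:45678", "executor-b");
    Map<String, Double> stats = new LinkedHashMap<>();
    stats.put("10.0.0.2:45678", 3.5);
    // prints [coordinator cpu=0.0, executor-a cpu=3.5, executor-b unresponsive]
    System.out.println(merge(coords, execs, stats));
  }
}
```

The point to take away is that only nodes that answered the stats RPC carry measured values. Coordinators that did not report stats are filled in from their endpoint alone.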
ExecutorServiceImpl class (server-side implementation of the executorService)
- Reference code
public static CoordExecRPC.NodeStats getNodeStatsFromContext(SabotContext context) {
  final ThreadsIterator threads = new ThreadsIterator(context, null);
  final MemoryIterator memoryIterator = new MemoryIterator(context, null);
  final WorkStats stats = context.getWorkStatsProvider().get();
  final CoordinationProtos.NodeEndpoint ep = context.getEndpoint();
  final double load = stats.getClusterLoad();
  final int configuredMaxWidth = (int) context.getClusterResourceInformation().getAverageExecutorCores(context.getOptionManager());
  final int actualMaxWidth = (int) Math.max(1, configuredMaxWidth * stats.getMaxWidthFactor());
  // both default to 0; the values computed below are percentages
  double memory = 0;
  double cpu = 0;
  // get cpu
  while(threads.hasNext()) {
    ThreadsIterator.ThreadSummary summary = (ThreadsIterator.ThreadSummary) threads.next();
    double cpuTime = summary.cpu_time == null ? 0 : summary.cpu_time;
    double numCores = summary.cores;
    cpu += (cpuTime / numCores);
  }
  // get memory
  if(memoryIterator.hasNext()) {
    MemoryIterator.MemoryInfo memoryInfo = ((MemoryIterator.MemoryInfo) memoryIterator.next());
    memory = memoryInfo.direct_current * 100.0 / memoryInfo.direct_max;
  }
  String ip = null;
  try {
    ip = InetAddress.getLocalHost().getHostAddress();
  } catch (UnknownHostException e) {
    // no op
  }
  return CoordExecRPC.NodeStats.newBuilder()
    .setCpu(cpu)
    .setMemory(memory)
    .setVersion(DremioVersionInfo.getVersion())
    .setPort(ep.getFabricPort())
    .setName(ep.getAddress())
    .setIp(ip)
    .setStatus("green")
    .setLoad(load)
    .setConfiguredMaxWidth(configuredMaxWidth)
    .setActualMaxWith(actualMaxWidth)
    .setCurrent(false)
    .build();
}
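The arithmetic in this method can be isolated to show why an idle executor ends up at 0. This is a minimal sketch under assumptions: `NodeStatsMath` and its methods are hypothetical, the inputs are plain numbers rather than `ThreadsIterator` and `MemoryIterator` rows, and nothing is assumed about the unit of `cpu_time` beyond what the code above does with it.

```java
// Hypothetical, simplified version of the arithmetic in getNodeStatsFromContext.
final class NodeStatsMath {
  private NodeStatsMath() {}

  // Sums each thread's CPU figure divided by the core count, like the loop
  // over ThreadsIterator. A null entry stands for a thread without a
  // cpu_time reading and counts as 0.
  static double cpuPercent(Double[] threadCpuTimes, int cores) {
    double cpu = 0;
    for (Double cpuTime : threadCpuTimes) {
      cpu += (cpuTime == null ? 0 : cpuTime) / cores;
    }
    return cpu;
  }

  // Direct memory in use as a percentage of the direct-memory limit.
  static double memoryPercent(long directCurrent, long directMax) {
    return directCurrent * 100.0 / directMax;
  }

  public static void main(String[] args) {
    long gib = 1024L * 1024 * 1024;
    // Idle node: no thread reports CPU and no direct memory is allocated.
    System.out.println(cpuPercent(new Double[] {null, 0.0}, 8)); // 0.0
    System.out.println(memoryPercent(0L, 8 * gib));              // 0.0
    // Busy node: 2 GiB of an 8 GiB direct-memory limit is in use.
    System.out.println(memoryPercent(2 * gib, 8 * gib));         // 25.0
  }
}
```

Both results for the idle node are exactly 0, which is the value the front end turns into N/A.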
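- Why the port returned by the API is -1
The reason is that the user port (userPort) is -1 (a random port is used), whereas the fabric port (fabricPort) is fixed.
Handling of the data returned by the API
Nodes class
// The port read here is the user port, which is -1, so the UI shows N/A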
public static NodeInfo fromEndpoint(CoordinationProtos.NodeEndpoint endpoint) {
  final boolean master = endpoint.getRoles().getMaster();
  final boolean coord = endpoint.getRoles().getSqlQuery();
  final boolean exec = endpoint.getRoles().getJavaExecutor();
  boolean isCompatible = isCompatibleVersion(endpoint.getDremioVersion());
  return new NodeInfo(
    endpoint.getAddress(),
    endpoint.getAddress(),
    endpoint.getAddress(),
    endpoint.getUserPort(),
    0d,
    0d,
    "green",
    master,
    coord,
    exec,
    isCompatible,
    endpoint.getNodeTag(),
    endpoint.getDremioVersion(),
    endpoint.getStartTime(),
    isCompatible ? NodeDetails.NONE.toMessage(null) : NodeDetails.INVALID_VERSION.toMessage(endpoint.getDremioVersion())
  );
}
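Notes
Dremio leaves some of this information undocumented, so reading it alongside the source code is the most reliable way to understand it.
References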
dac/ui/src/pages/AdminPage/subpages/NodeActivity/NodeActivityView.js
dac/backend/src/main/java/com/dremio/dac/resource/SystemResource.java
services/executorservice/src/main/java/com/dremio/service/executor/ExecutorServiceClient.java
sabot/kernel/src/main/java/com/dremio/exec/service/executor/ExecutorServiceProductClient.java
sabot/kernel/src/main/java/com/dremio/sabot/rpc/CoordExecService.java
sabot/kernel/src/main/java/com/dremio/exec/service/executor/ExecutorServiceImpl.java
sabot/kernel/src/main/java/com/dremio/exec/store/sys/ThreadsIterator.java
sabot/kernel/src/main/java/com/dremio/exec/store/sys/MemoryIterator.java
common/legacy/src/main/java/com/dremio/common/VM.java
sabot/kernel/src/main/java/com/dremio/exec/server/ContextService.java
dac/backend/src/main/java/com/dremio/dac/model/system/Nodes.java