Nacos-client call fails with: failed to create cache dir

 

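Background:

  Our project uses Gateway to split traffic across microservices, that is, to control what proportion of traffic is routed to each instance of a single microservice, so the gateway contains many methods that call the Nacos API.

  When we deployed to a new environment, the following error appeared. Our servers run on k8s, and every environment uses the same image.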

2021-11-23 16:53:54.568 ERROR [***-gateway,,,] 1 --- [           main] com.alibaba.nacos.client.naming          : [NA] failed to write cache for dom:DEFAULT_GROUP@@***-****

java.lang.IllegalStateException: failed to create cache dir: /root/nacos/naming/753378b3-d4ad-4f1a-859b-f9d57df33c9f
    at com.alibaba.nacos.client.naming.cache.DiskCache.makeSureCacheDirExists(DiskCache.java:154) ~[nacos-client-1.1.4.jar:na]
    at com.alibaba.nacos.client.naming.cache.DiskCache.write(DiskCache.java:45) ~[nacos-client-1.1.4.jar:na]
    at com.alibaba.nacos.client.naming.core.HostReactor.processServiceJSON(HostReactor.java:184) [nacos-client-1.1.4.jar:na]

 

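Troubleshooting:

  The error message is clear: the client tried to write a cache file on the server and failed. Following the stack trace, we found the class that raised the error in nacos-client-1.1.4.jar: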

package com.alibaba.nacos.client.naming.cache;

public class DiskCache {

    public static void write(ServiceInfo dom, String dir) {
        try {
            makeSureCacheDirExists(dir);
            File file = new File(dir, dom.getKeyEncoded());
            if (!file.exists()) {
                // add another !file.exists() to avoid conflicted creating-new-file from multi-instances
                if (!file.createNewFile() && !file.exists()) {
                    throw new IllegalStateException("failed to create cache file");
                }
            }
            StringBuilder keyContentBuffer = new StringBuilder("");
            String json = dom.getJsonFromServer();
            if (StringUtils.isEmpty(json)) {
                json = JSON.toJSONString(dom);
            }
            keyContentBuffer.append(json);
            //Use the concurrent API to ensure the consistency.
            ConcurrentDiskUtil.writeFileContent(file, keyContentBuffer.toString(), Charset.defaultCharset().toString());
        } catch (Throwable e) {
            NAMING_LOGGER.error("[NA] failed to write cache for dom:" + dom.getName(), e);
        }
    }

    *******

    private static File makeSureCacheDirExists(String dir) {
        File cacheDir = new File(dir);
        if (!cacheDir.exists() && !cacheDir.mkdirs()) {
            throw new IllegalStateException("failed to create cache dir: " + dir);
        }
        return cacheDir;
    }
}

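  The write method calls makeSureCacheDirExists. In makeSureCacheDirExists, if the cache directory does not exist and creating it with mkdirs() also fails, an IllegalStateException is thrown. The catch block in write then logs it, which produces the ERROR line shown in the log.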
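  To see why mkdirs() would fail for a given path, a small probe along the following lines can be run inside the container. This is only a sketch: CacheDirProbe and probe are hypothetical names, not part of nacos-client, and the check is an approximation based on File.canWrite() for the current user.

```java
import java.io.File;

// Hypothetical diagnostic helper; not part of nacos-client.
class CacheDirProbe {

    /**
     * Inspects the path the way DiskCache.makeSureCacheDirExists would use it,
     * but returns a description of the likely problem instead of throwing.
     * Nothing is created on disk.
     */
    static String probe(String dir) {
        File cacheDir = new File(dir).getAbsoluteFile();
        if (cacheDir.isDirectory()) {
            return cacheDir.canWrite()
                    ? "ok: directory exists and is writable"
                    : "directory exists but is not writable: " + cacheDir;
        }
        if (cacheDir.exists()) {
            return "path exists but is not a directory: " + cacheDir;
        }
        // mkdirs() has to create entries under the nearest ancestor that exists.
        File ancestor = cacheDir.getParentFile();
        while (ancestor != null && !ancestor.exists()) {
            ancestor = ancestor.getParentFile();
        }
        if (ancestor == null) {
            return "no existing ancestor found for: " + cacheDir;
        }
        if (!ancestor.isDirectory()) {
            return "ancestor is not a directory: " + ancestor;
        }
        if (!ancestor.canWrite()) {
            return "ancestor is not writable: " + ancestor;
        }
        return "ok: can be created under " + ancestor;
    }

    public static void main(String[] args) {
        // Pass the directory from the exception message as the first argument.
        String dir = args.length > 0 ? args[0] : System.getProperty("user.home") + "/nacos/naming";
        System.out.println(probe(dir));
    }
}
```

  Running it with the directory from the exception message as the argument shows whether the directory itself or one of its ancestors is the part that cannot be written.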

  Following the call chain to find out who calls DiskCache.write, I found HostReactor. The cache path cacheDir is passed in through its constructor.

package com.alibaba.nacos.client.naming.core;
public class HostReactor {
  
  public HostReactor(EventDispatcher eventDispatcher, NamingProxy serverProxy, String cacheDir, boolean loadCacheAtStart, int pollingThreadCount) {
    ......
}
}

  Going one step further back, HostReactor is instantiated when NacosNamingService is constructed:

package com.alibaba.nacos.client.naming;

@SuppressWarnings("PMD.ServiceOrDaoClassShouldEndWithImplRule")
public class NacosNamingService implements NamingService {
  private HostReactor hostReactor;

  public NacosNamingService(String serverList) {
        Properties properties = new Properties();
        properties.setProperty(PropertyKeyConst.SERVER_ADDR, serverList);

        init(properties);
    }

    public NacosNamingService(Properties properties) {
        init(properties);
    }

    private void init(Properties properties) {
        namespace = InitUtils.initNamespaceForNaming(properties);
        initServerAddr(properties);
        InitUtils.initWebRootContext();
        initCacheDir();
        initLogName(properties);

        eventDispatcher = new EventDispatcher();
        serverProxy = new NamingProxy(namespace, endpoint, serverList);
        serverProxy.setProperties(properties);
        beatReactor = new BeatReactor(serverProxy, initClientBeatThreadCount(properties));
        hostReactor = new HostReactor(eventDispatcher, serverProxy, cacheDir, isLoadCacheAtStart(properties), initPollingThreadCount(properties));
    }
  
    private void initCacheDir() {
        cacheDir = System.getProperty("com.alibaba.nacos.naming.cache.dir");
        if (StringUtils.isEmpty(cacheDir)) {
            cacheDir = System.getProperty("user.home") + "/nacos/naming/" + namespace;
        }
    }

    ......
}

  Both NacosNamingService constructors call init, and init calls initCacheDir() to assign the cacheDir field before passing it to the HostReactor constructor.

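  Once you see the body of initCacheDir, the picture is clear. The Nacos cache path is determined in one of two ways:

  1. Explicitly, through the com.alibaba.nacos.naming.cache.dir property. Note that the code reads it with System.getProperty, so it has to be visible as a JVM system property.

  2. Otherwise, the home directory (user.home) of the user running the process + /nacos/naming/ + the namespace. This is where the /root/nacos/naming/... path in the exception comes from.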
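  To check which directory a given deployment would end up with, the lookup in initCacheDir can be reproduced in a few lines. The sketch below copies the logic shown above from nacos-client 1.1.4; CacheDirResolver and resolve are hypothetical names, and the namespace value is a placeholder to be replaced with your own.

```java
// Hypothetical helper that mirrors NacosNamingService.initCacheDir (nacos-client 1.1.4).
class CacheDirResolver {

    static final String CACHE_DIR_PROPERTY = "com.alibaba.nacos.naming.cache.dir";

    /** Returns the directory the naming client would use for the given namespace. */
    static String resolve(String namespace) {
        // 1. An explicit JVM system property wins and is used as-is.
        String cacheDir = System.getProperty(CACHE_DIR_PROPERTY);
        if (cacheDir == null || cacheDir.isEmpty()) {
            // 2. Otherwise fall back to <user.home>/nacos/naming/<namespace>.
            cacheDir = System.getProperty("user.home") + "/nacos/naming/" + namespace;
        }
        return cacheDir;
    }

    public static void main(String[] args) {
        // "my-namespace-id" is a placeholder, not a real namespace.
        String namespace = args.length > 0 ? args[0] : "my-namespace-id";
        System.out.println(resolve(namespace));
    }
}
```

  Run it as the same user, and with the same -D options, as the application. Note that when the property is set, the namespace is not appended to the path.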

  

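Solutions:

  1. If the server only has the root account, try asking the ops team to grant write permission on the /root/nacos/naming/ directory.

  2. Writing under root's directory is usually restricted, so you can instead start the application under a different account on the server and grant write permission on that account's /user/nacos/naming/ directory (its home directory + /nacos/naming/).

  3. Configure com.alibaba.nacos.naming.cache.dir so that the cache is written to a directory that is open for writing. Judging from initCacheDir, the value is read with System.getProperty, so an entry in the application's yml file is only picked up if something copies it into the JVM system properties. Passing -Dcom.alibaba.nacos.naming.cache.dir=<dir> at startup is the direct route.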
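  For the third option, since initCacheDir reads the value through System.getProperty, one way is to set the system property at the very start of main, before anything creates a NacosNamingService. The following is a minimal sketch under that assumption. CacheDirSetup and useCacheDir are hypothetical names, and the directory under java.io.tmpdir is only an example location.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical startup helper; not part of nacos-client.
class CacheDirSetup {

    static final String CACHE_DIR_PROPERTY = "com.alibaba.nacos.naming.cache.dir";

    /**
     * Creates the directory if needed, verifies it is writable, and publishes it
     * as the system property that initCacheDir reads.
     */
    static Path useCacheDir(String dir) throws IOException {
        Path path = Paths.get(dir).toAbsolutePath();
        // Succeeds if the directory already exists; throws if it cannot be created.
        Files.createDirectories(path);
        if (!Files.isWritable(path)) {
            throw new IOException("cache dir is not writable: " + path);
        }
        System.setProperty(CACHE_DIR_PROPERTY, path.toString());
        return path;
    }

    public static void main(String[] args) throws IOException {
        // Example location only; pick a directory that is writable in your image.
        Path dir = useCacheDir(System.getProperty("java.io.tmpdir") + "/nacos/naming");
        System.out.println("Nacos naming cache dir: " + dir);
        // The application (for example the gateway's Spring context) would be started here.
    }
}
```

  Failing fast at startup with a clear message is a design choice: DiskCache.write catches the exception and only logs it, so a bad cache directory is otherwise easy to miss.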

 

posted @ 2021-11-23 18:01  闲人鹤