【使用时发生的意外】HDFS 分布式写入问题 AlreadyBeingCreatedException

进行追加文件时出现AlreadyBeingCreatedException错误

堆栈信息大致如下:

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to create file [/secsight/log2/666/p0001] for [DFSClient_NONMAPREDUCE_200580206_1756] for client [192.168.10.117], because this file is already being created by [DFSClient_NONMAPREDUCE_-2109133545_2516] on [192.168.10.117]
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3122)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2905)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3186)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:3149)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:611)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.append(AuthorizationProviderProxyClientProtocol.java:124)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:416)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

    at org.apache.hadoop.ipc.Client.call(Client.java:1469)
    at org.apache.hadoop.ipc.Client.call(Client.java:1400)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy9.append(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.append(ClientNamenodeProtocolTranslatorPB.java:313)
    at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.append(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1756)
    at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1792)
    at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1785)
    at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:323)
    at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:319)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:319)
    at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1163)
    at com.ultrapower.hdfs.HdfsUtils.appendFile(HdfsUtils.java:54)
    at com.ultrapower.secsight.Runner.lambda$main$0(Runner.java:31)
    at java.lang.Thread.run(Thread.java:748)

 

 

 

目前得到的可能原因:

  多进程进行同一文件的写入在HDFS中是可能引发这种错误的。

  hadoop 的dfs里边有个lease manager 维护了文件path -> lease和 DFSClient name -> lease -> path (多个) 的映射关系,我估计是这个lease的问题,看下是不是被close(),而未来的及释放的lease造成的。

 

 

 

 

 

可能引起该错误的代码:https://www.programcreek.com/java-api-examples/index.php?api=org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException

/** Test two consecutive appends on a file with a full block. */
@Test
public void testAppendTwice() throws Exception {
  Configuration conf = new HdfsConfiguration();
  MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).build();
  final FileSystem fs1 = cluster.getFileSystem();
  final FileSystem fs2 = AppendTestUtil.createHdfsWithDifferentUsername(conf);
  try {

    final Path p = new Path("/testAppendTwice/foo");
    final int len = 1 << 16;
    final byte[] fileContents = AppendTestUtil.initBuffer(len);

    {
      // create a new file with a full block.
      FSDataOutputStream out = fs2.create(p, true, 4096, (short)1, len);
      out.write(fileContents, 0, len);
      out.close();
    }

    //1st append does not add any data so that the last block remains full
    //and the last block in INodeFileUnderConstruction is a BlockInfo
    //but not BlockInfoUnderConstruction. 
    fs2.append(p);
    
    //2nd append should get AlreadyBeingCreatedException
    fs1.append(p);
    Assert.fail();
  } catch(RemoteException re) {
    AppendTestUtil.LOG.info("Got an exception:", re);
    Assert.assertEquals(AlreadyBeingCreatedException.class.getName(),
        re.getClassName());
  } finally {
    fs2.close();
    fs1.close();
    cluster.shutdown();
  }
}
 

 

 

 

https://issues.apache.org/jira/browse/HDFS-11367

https://issues.apache.org/jira/browse/HDFS-7203

posted @ 2017-08-30 15:37  Mr.Ming2  阅读(3323)  评论(0编辑  收藏  举报