Summary:
import java.io
from datetime import datetime
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
class GetDate(St...
Summary:
%matplotlib inline or %matplotlib notebook
Summary:
import org.apache.commons.io.IOUtils
import java.nio.charset.*
import java.text.SimpleDateFormat
import groovy.json.*
def flowFile = session.get()
flowFile = session.write(flowFile, {inputStream, outStream...
Summary:
su hdfs
hdfs dfs -chown -R admin /
org.apache.hadoop.security.AccessControlException: Permission denied: user=admin, access=WRITE
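Summary:
1. Check disk usage on the current system: df -h
2. Check the size of each subdirectory in a directory: du -sh * or du -sh /mnt/fdfs_backup/*
3. Copy a file to a remote host: scp local_file remote_ip:remote_folder
4. Copy a file to HDFS: hadoop fs -put job hdfs://192.168.44.28:8020/ Note: -get copies HDFS ...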
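The first two commands above can be approximated from Python when a script needs the numbers rather than the terminal output. This is a minimal sketch, not an exact reimplementation: the helper names `human`, `dir_size`, and `subdir_sizes` are made up for illustration, and the sizes are apparent file sizes, so they can differ from what `du` reports in disk blocks.

```python
import os
import shutil


def human(n_bytes):
    """Format a byte count roughly the way `df -h` and `du -sh` do."""
    size = float(n_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


def dir_size(path):
    """Sum the sizes of all regular files under `path` (apparent size, not disk blocks)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def subdir_sizes(parent):
    """Return {entry_name: bytes} for each entry directly under `parent`, like `du -sh *`."""
    sizes = {}
    for entry in os.scandir(parent):
        if entry.is_dir(follow_symlinks=False):
            sizes[entry.name] = dir_size(entry.path)
        elif entry.is_file(follow_symlinks=False):
            sizes[entry.name] = entry.stat().st_size
    return sizes


if __name__ == "__main__":
    # Rough equivalent of `df -h` for the root file system only.
    usage = shutil.disk_usage("/")
    print("total", human(usage.total), "used", human(usage.used), "free", human(usage.free))
    # Rough equivalent of `du -sh *` in the current directory.
    for name, size in sorted(subdir_sizes(".").items()):
        print(human(size), name)
```

Symbolic links are skipped on purpose so that a link pointing outside the directory does not inflate the total.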
Summary:
1. HDFS file: {"retCode":1,"retMsg":"Success","data":[{"secID":"000001.XSHE","ticker":"000001","secShortName":"深发展A","exchangeCD":"XSHE","tradeDate":"1991-10-21","...
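A record of the shape shown in the excerpt above can be flattened into rows before loading it anywhere else. The sketch below is illustrative only: `SAMPLE` is trimmed to the fields visible in the excerpt (the real record continues), `extract_rows` is a hypothetical helper, and treating `retCode == 1` as success is an assumption based on the accompanying `"retMsg":"Success"`.

```python
import json

# Trimmed to the fields visible in the excerpt; the real record has more fields.
SAMPLE = (
    '{"retCode":1,"retMsg":"Success","data":[{"secID":"000001.XSHE",'
    '"ticker":"000001","secShortName":"深发展A","exchangeCD":"XSHE",'
    '"tradeDate":"1991-10-21"}]}'
)


def extract_rows(payload, fields):
    """Return one list per entry in `data`, holding the requested fields in order."""
    doc = json.loads(payload)
    # Assumption: retCode 1 signals success, as the sample's retMsg suggests.
    if doc.get("retCode") != 1:
        raise ValueError(f"unexpected response: {doc.get('retMsg')}")
    # Missing fields come back as None rather than raising.
    return [[rec.get(name) for name in fields] for rec in doc.get("data", [])]


if __name__ == "__main__":
    for row in extract_rows(SAMPLE, ["secID", "ticker", "tradeDate"]):
        print("\t".join(str(value) for value in row))
```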
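Summary:
While processing index market data (IDXD), I ran into a very strange problem: Kylin queries were performing poorly. After some investigation I found the cause and resolved it. Symptoms: select count(*) from sensitop.idxd where ticker = '000300' and tradedate between '2016-01-01' and '2016-07-01' is fast, under one second; select * from sens...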
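Summary:
Troubleshooting approach: for ordinary errors, read the error output and search Google by keyword. For abnormal errors (such as a namenode or datanode dying for no apparent reason), check the Hadoop logs ($HADOOP_HOME/logs) or the Hive logs.
Hadoop errors
1. Datanode fails to start: after adding a datanode, it fails to start properly and the process dies shortly afterwards for no apparent reason. The namenode log shows the following: 2013-06-21 18:...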
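Checking the logs under $HADOOP_HOME/logs can be partly automated. The following is a rough sketch under stated assumptions: `scan_logs` is a hypothetical helper, the `ERROR|FATAL` pattern and the `.log`/`.out` suffix filter are illustrative defaults, and rotated log files with other suffixes would need the filter widened.

```python
import os
import re


def scan_logs(log_dir, pattern=r"\b(ERROR|FATAL)\b", suffixes=(".log", ".out")):
    """Return (path, line_number, line) for every log line matching `pattern`."""
    regex = re.compile(pattern)
    hits = []
    for root, _dirs, files in os.walk(log_dir):
        for name in sorted(files):
            # Assumption: only files ending in .log or .out are of interest.
            if not name.endswith(suffixes):
                continue
            path = os.path.join(root, name)
            # errors="replace" keeps the scan going past undecodable bytes.
            with open(path, errors="replace") as fh:
                for lineno, line in enumerate(fh, 1):
                    if regex.search(line):
                        hits.append((path, lineno, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Falls back to ./logs when HADOOP_HOME is not set.
    log_dir = os.path.join(os.environ.get("HADOOP_HOME", "."), "logs")
    for path, lineno, line in scan_logs(log_dir):
        print(f"{path}:{lineno}: {line}")
```

The matching lines then supply the keywords to search for, as the troubleshooting approach above suggests.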
Summary:
HDFS - Could not obtain block
MapReduce Total cumulative CPU time: 33 seconds 380 msec
Ended Job = job_201308291142_4635 with errors
Error during job, obtaining debugging information...
Job Tracking URL:h...
Summary:
Overview:
1. Create the Hive table:
CREATE TABLE IF NOT EXISTS newsinfo.test (name STRING)
CLUSTERED BY (name) INTO 3 BUCKETS
ROW FORMAT DELIMITED
STORED AS ORC
TBLPROPERTIES ('transactional'='true');
2. Here ReplaceText is used to generate js...