Programming Hive (《Hive编程指南》), Section 14.3 Projecting Transformation: Analyzing the Cause of an Error in Practice

While working through the projecting transformation in Section 14.3, I ran the statement hive (default)> SELECT TRANSFORM(col1, col2) USING '/bin/cut -f1' AS newA, newB FROM a; and got this error:

Ended Job = job_local1231989520_0004 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

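This is not the output the author shows.
At first I ignored the error, skipped the problem, and kept reading.
But the next statement, SELECT TRANSFORM(col1, col2) USING '/bin/cut -f1' AS newA FROM a;, failed with the same error: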

Ended Job = job_local1771018374_0006 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

The output contained no explicit error message, so I first opened another terminal and tried a few commands:

duxing@duxing-X550JK:~$ /bin/cut
bash: /bin/cut: No such file or directory
duxing@duxing-X550JK:~$ echo "4 5" | /bin/cut -f1
bash: /bin/cut: No such file or directory
duxing@duxing-X550JK:~$ echo "4 5" |cut -f1
4 5
duxing@duxing-X550JK:~$ echo "4 5" |cut -f1
4 5

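This first attempt did not lead me to the cause.
I then tried again in the terminal. Running SELECT TRANSFORM(col1, col2) USING 'cut -f1' AS newA, newB FROM a; gave the expected result, and ls /bin | grep "cut" showed that there is no cut file in /bin.
So the statement failed because, on my Ubuntu system, the cut program is not located in /bin/.
Running which cut confirmed this: cut is in /usr/bin/.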
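
The same check can be written as a short shell snippet. This is only a minimal sketch, assuming a POSIX shell; the variable name cut_path is my own and does not come from the book.

```shell
# Resolve the real location of cut instead of assuming /bin/cut.
cut_path="$(command -v cut)"
echo "cut resolves to: ${cut_path}"

# Check the hard-coded path used in the book's example.
if [ -x /bin/cut ]; then
    echo "/bin/cut exists"
else
    echo "/bin/cut is missing; use 'cut' or ${cut_path} in the USING clause"
fi
```

If the second branch is printed, writing USING 'cut -f1' and letting PATH find the program avoids depending on where a particular distribution installs it.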
Record of the commands I ran

Hive terminal:
hive (default)> SELECT TRANSFORM(col1, col2) USING '/bin/cut -f1' AS newA, newB FROM a;
Ended Job = job_local1231989520_0004 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
hive (default)> SELECT TRANSFORM(col1, col2) USING '/bin/cut -f1' AS newA, newB FROM a;
Ended Job = job_local1383279613_0005 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
hive (default)> SELECT TRANSFORM(col1, col2) USING '/bin/cut -f1' AS newA FROM a;
Ended Job = job_local1771018374_0006 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
hive (default)> SELECT TRANSFORM(col1, col2) USING '/bin/cut -f1' AS newA FROM a;
Ended Job = job_local81582517_0007 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
hive (default)> SELECT TRANSFORM(col1, col2) USING '/bin/sed s/4/10' AS newA, newB AS a;
NoViableAltException(37@[])
	at org.apache.hadoop.hive.ql.parse.HiveParser.rowFormat(HiveParser.java:34626)
	at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectTrfmClause(HiveParser_SelectClauseParser.java:2021)
	at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:1216)
	at org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:51850)
	at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:45661)
	at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:45568)
	at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:44584)
	at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:44454)
	at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1696)
	at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1178)
	at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:204)
	at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:444)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1242)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1384)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
FAILED: ParseException line 1:67 cannot recognize input near 'AS' 'a' '<EOF>' in serde specification
hive (default)> SELECT TRANSFORM(col1, col2) USING '/bin/sed s/4/10' AS newA, newB FROM a;
/bin/sed: -e expression #1, char 6: unterminated `s' command
org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
	at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:585)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
	at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:585)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
	at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:585)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:697)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Ended Job = job_local1180910273_0008 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
hive (default)> SELECT TRANSFORM(col1, col2) USING 'cut -f1' AS newA, newB FROM a;
newA	newB
4	NULL
3	NULL
hive (default)> SELECT TRANSFORM(col1, col2) USING 'cut -f1' AS newA FROM a;
newA
4
3
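
As far as I understand, TRANSFORM by default hands each row to the script as tab-separated fields and reads the script's output back the same way, so cut -f1 returns only one column and newB has nothing to fill it. That would explain the NULL values above. Here is a rough simulation in the shell, assuming two hypothetical tab-separated rows (the col2 values are invented for illustration):

```shell
# Two hypothetical rows standing in for table a; the col2 values are invented.
rows="$(printf '4\t5\n3\t2\n')"

# cut splits on TAB by default, so -f1 keeps only the first column.
first_col="$(printf '%s\n' "$rows" | cut -f1)"
printf '%s\n' "$first_col"

# With a space instead of a TAB the line contains no delimiter,
# so cut prints the whole line unchanged.
no_tab="$(echo "4 5" | cut -f1)"
printf '%s\n' "$no_tab"
```

The second case matches my terminal test, where echo "4 5" | cut -f1 printed 4 5 rather than 4.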

Terminal 1:

duxing@duxing-X550JK:~$ /bin/cut
bash: /bin/cut: No such file or directory
duxing@duxing-X550JK:~$ echo "4 5" | /bin/cut -f1
bash: /bin/cut: No such file or directory
duxing@duxing-X550JK:~$ echo "4 5" |cut -f1
4 5
duxing@duxing-X550JK:~$ echo "4 5" |cut -f1
4 5
duxing@duxing-X550JK:~$ echo "4 5" |sed s/4/10
sed: -e expression #1, char 6: unterminated `s' command
duxing@duxing-X550JK:~$ sed s/4/10
sed: -e expression #1, char 6: unterminated `s' command
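
The sed error is a separate problem from the missing /bin/cut. The s command needs a closing delimiter, and s/4/10 leaves it out. A minimal sketch of the terminated form, assuming a tab-separated sample line of my own:

```shell
# Unterminated: the closing "/" is missing, so sed rejects the expression.
# echo "4 5" | sed s/4/10

# Terminated form: replaces the first "4" on each line with "10".
result="$(printf '4\t5\n' | sed 's/4/10/')"
printf '%s\n' "$result"
```

In the Hive statement the corresponding clause would be USING '/bin/sed s/4/10/'. On my machine sed really is in /bin, so only the trailing slash was missing there.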

Terminal 2:

duxing@duxing-X550JK:~$ ls /bin | grep "cut"
duxing@duxing-X550JK:~$ ls /bin
bash          bzmore          dd             fgrep       kbd_mode    ls          nc                ntfsfallocate  ps         sh                    systemd-inhibit                 uname           zfgrep
bunzip2       cat             df             findmnt     kill        lsblk       nc.openbsd        ntfsfix        pwd        sh.distrib            systemd-machine-id-setup        uncompress      zforce
busybox       cgroups-mount   dir            fuser       kmod        lsmod       netcat            ntfsinfo       rbash      sleep                 systemd-notify                  unicode_start   zgrep
bzcat         cgroups-umount  dmesg          fusermount  less        mkdir       netstat           ntfsls         readlink   ss                    systemd-tmpfiles                vdir            zless
bzcmp         chacl           dnsdomainname  getfacl     lessecho    mknod       networkctl        ntfsmove       red        static-sh             systemd-tty-ask-password-agent  vmmouse_detect  zmore
bzdiff        chgrp           domainname     grep        lessfile    mktemp      nisdomainname     ntfstruncate   rm         stty                  tailf                           wdctl           znew
bzegrep       chmod           dumpkeys       gunzip      lesskey     more        ntfs-3g           ntfswipe       rmdir      su                    tar                             which
bzexe         chown           echo           gzexe       lesspipe    mount       ntfs-3g.probe     open           rnano      sync                  tempfile                        whiptail
bzfgrep       chvt            ed             gzip        ln          mountpoint  ntfs-3g.secaudit  openvt         run-parts  systemctl             touch                           ypdomainname
bzgrep        cp              efibootmgr     hciconfig   loadkeys    mt          ntfs-3g.usermap   pidof          sed        systemd               true                            zcat
bzip2         cpio            egrep          hostname    login       mt-gnu      ntfscat           ping           setfacl    systemd-ask-password  udevadm                         zcmp
bzip2recover  dash            false          ip          loginctl    mv          ntfscluster       ping6          setfont    systemd-escape        ulockmgr_server                 zdiff
bzless        date            fgconsole      journalctl  lowntfs-3g  nano        ntfscmp           plymouth       setupcon   systemd-hwdb          umount                          zegrep
duxing@duxing-X550JK:~$ which cut
/usr/bin/cut
duxing@duxing-X550JK:~$ /usr/bin/cut
/usr/bin/cut: you must specify a list of bytes, characters, or fields
Try '/usr/bin/cut --help' for more information.

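Addendum: while writing this post I realized that running /bin/cut in the terminal had already revealed the cause. The message bash: /bin/cut: No such file or directory says that /bin/cut does not exist, whereas running /usr/bin/cut only complains that no arguments were given.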

posted @ 2018-05-03 22:48  DataNerd