Hive Troubleshooting Notes

Problem: Error when starting Hive (Hive 2.3.9)

Error log

[root@node02 hive]# hive 
which: no hbase in (/opt/hive/bin:/opt/hadoop-3.2.1/bin:/opt/hadoop-3.2.1/sbin:/usr/java/jdk1.8.0_162/bin:/opt/hadoop-3.2.1/bin:/opt/hadoop-3.2.1/sbin:/usr/java/jdk1.8.0_162/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-3.2.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:536)
at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:554)
at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:448)
at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:4051)
at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:4009)
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:81)
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:65)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:702)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)

Cause analysis

Error 1: which: no hbase in ...
Cause: HBase is not installed (or not on the PATH). This does not prevent Hive from starting, so it can be ignored for now.

Error 2: NoSuchMethodError: com.google.common.base.Preconditions.checkArgument
Cause: the guava.jar shipped with Hadoop conflicts with the older guava.jar shipped with Hive.

Solution

Replace the lower-version guava.jar with the higher version so that Hadoop and Hive load the same one.

## Check the guava.jar version shipped with Hadoop
[root@node02 hive]# ls /opt/hadoop-3.2.1/share/hadoop/common/lib/ | grep guava
guava-27.0-jre.jar
listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar
## Check the guava.jar version shipped with Hive
[root@node02 hive]# ls lib/ | grep guava
guava-14.0.1.jar
jackson-datatype-guava-2.4.6.jar
## Remove the lower version and copy in the higher one (copy rather than move, so Hadoop keeps its own jar)
[root@node02 hive]# rm -f lib/guava-14.0.1.jar
[root@node02 hive]# cp /opt/hadoop-3.2.1/share/hadoop/common/lib/guava-27.0-jre.jar lib/
[root@node02 hive]# ls lib/ | grep guava
guava-27.0-jre.jar
jackson-datatype-guava-2.4.6.jar
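The rule applied above (keep the higher guava version) can be sketched in a few lines of Python. This is purely illustrative, not part of any Hive or Hadoop tooling; the helper function and jar names come from the listings above:

```python
import re

def guava_version(jar_name):
    """Parse the version tuple out of a jar filename like 'guava-27.0-jre.jar'."""
    m = re.match(r'guava-(\d+(?:\.\d+)*)', jar_name)
    return tuple(int(p) for p in m.group(1).split('.')) if m else None

# Jar names taken from the directory listings above
hadoop_guava = guava_version('guava-27.0-jre.jar')   # (27, 0)
hive_guava = guava_version('guava-14.0.1.jar')       # (14, 0, 1)

# Keep whichever version is higher -- here, the one from Hadoop
assert hadoop_guava > hive_guava

# Unrelated jars such as jackson-datatype-guava are not matched
assert guava_version('jackson-datatype-guava-2.4.6.jar') is None
```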

Newly created Hive tables are not immediately visible in Impala

With Hive and Impala deployed via CDH, a table created in Hive does not show up in Impala right away, because Impala's cached metadata has not been refreshed. Running the invalidate metadata command in Impala reloads the metadata and fixes this.
Here is a quick test.
Create a table in Hive and list the tables:

CREATE TABLE test_01(
  id INT, name STRING, sex STRING, dataday STRING);
-- list the tables in Hive
hive> show tables;
OK
test_01
test_table

Check in Impala; the new table does not appear yet:

+--------------------------------------+
| name                                 |
+--------------------------------------+
| test_table                           |
+--------------------------------------+

Run invalidate metadata, then check again:

[nd3:21000] default> invalidate metadata;
-- list the tables in Impala again
[nd3:21000] default> show tables;
+--------------------------------------+
| name                                 |
+--------------------------------------+
| test_01                              |
| test_table                           |
+--------------------------------------+

The new table now shows up, as expected.

Additional note

To refresh the metadata of a single table instead of the whole catalog, use refresh tableName.

[nd2:21000] default> refresh test_table;
Query: refresh test_table
Fetched 0 row(s) in 0.05s
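The behaviour above can be pictured with a toy model of Impala's metadata cache. This is purely illustrative (two Python sets standing in for the real metastore and Impala's catalog cache), not how the catalog service actually works internally:

```python
# Toy model: tables created directly through the Hive metastore stay
# invisible to Impala until its cache is reloaded, which is what
# "invalidate metadata" does.
metastore = {'test_table'}            # source of truth: the Hive metastore
impala_cache = set(metastore)         # Impala's cached copy of the catalog

metastore.add('test_01')              # CREATE TABLE ... run from Hive
assert 'test_01' not in impala_cache  # Impala still serves the stale cache

impala_cache = set(metastore)         # "invalidate metadata": reload everything
assert 'test_01' in impala_cache      # the new table is now visible
```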

Enabling and disabling dynamic partitioning in Hive

Out of the box, Hive restricts dynamic partitioning: hive.exec.dynamic.partition.mode defaults to strict, which requires at least one static partition column.
Attempting a fully dynamic insert without changing this fails with:

FAILED: SemanticException [Error 10096]: Dynamic partition strict mode requires at least one static partition column.
To turn this off set hive.exec.dynamic.partition.mode=nonstrict

Enable dynamic partitioning as the error message suggests:

set hive.exec.dynamic.partition=true;           -- enable dynamic partitioning (default: false in old releases)
set hive.exec.dynamic.partition.mode=nonstrict; -- allow all partition columns to be dynamic (default: strict)

Configuring this in CDH

On a CDH deployment the same settings can be made directly in the web UI.
Open Hive > Configuration, search for hive-site, find "Hive Client Advanced Configuration Snippet (Safety Valve) for hive-site.xml", and add:

Name:        hive.exec.dynamic.partition
Value:       true
Description: enables dynamic partitioning

Name:        hive.exec.dynamic.partition.mode
Value:       nonstrict
Description: allows all partition columns to be dynamic

Save, then redeploy the client configuration.

substr (substring) differences between Hive and Impala

In Hive, substr and substring are equivalent string-extraction functions, and the start position may be given as either 0 or 1 with identical results:

hive> select substr('2020-01-01 10:30:00',1,10);
2020-01-01
hive> select substring('2020-01-01 10:30:00',1,10);
2020-01-01
hive> select substring('2020-01-01 10:30:00',0,10);
2020-01-01

In Impala, substr and substring are likewise equivalent, but the start position is strictly 1-based.
The real trap is that a start position of 0 does not raise an error; the query silently returns a result different from what you expect.

[bd14:21000] devp_src> select substr('2020-01-01 10:30:00',1,10);
+--------------------------------------+
| substr('2020-01-01 10:30:00', 1, 10) |
+--------------------------------------+
| 2020-01-01                           |
+--------------------------------------+
[bd14:21000] devp_src> select substring('2020-01-01 10:30:00',1,10);
+-----------------------------------------+
| substring('2020-01-01 10:30:00', 1, 10) |
+-----------------------------------------+
| 2020-01-01                              |
+-----------------------------------------+
[bd14:21000] devp_src> select substring('2020-01-01 10:30:00',0,10);
+-----------------------------------------+
| substring('2020-01-01 10:30:00', 0, 10) |
+-----------------------------------------+
|                                         |
+-----------------------------------------+

Summary: in both Hive and Impala, always start substr (substring) positions at 1, never at 0.
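The difference can be pinned down with a small Python model of the two behaviours. This is my own approximation based on the queries above; it covers only non-negative start positions (both engines also accept negative offsets, which are not modelled here):

```python
def hive_substr(s, start, length):
    # Hive: 1-based, but a start of 0 is treated the same as 1
    if start == 0:
        start = 1
    return s[start - 1 : start - 1 + length]

def impala_substring(s, start, length):
    # Impala: strictly 1-based; a start of 0 silently yields an empty
    # string (negative offsets are not modelled in this sketch)
    if start <= 0:
        return ''
    return s[start - 1 : start - 1 + length]

ts = '2020-01-01 10:30:00'
assert hive_substr(ts, 1, 10) == '2020-01-01'
assert hive_substr(ts, 0, 10) == '2020-01-01'    # 0 behaves like 1 in Hive
assert impala_substring(ts, 1, 10) == '2020-01-01'
assert impala_substring(ts, 0, 10) == ''         # the Impala trap
```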

FAILED: Execution Error, return code 2

Error log

Running a SQL statement fails with:

Starting Job = job_1675993570983_0004, Tracking URL = http://node03:8088/proxy/application_1675993570983_0004/
Kill Command = /opt/hadoop-3.2.1/bin/mapred job -kill job_1675993570983_0004
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2023-02-13 21:06:24,637 Stage-1 map = 0%, reduce = 0%
Ended Job = job_1675993570983_0004 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

Follow the Tracking URL to see the detailed error:

Application application_1675993570983_0004 failed 2 times due to AM Container for appattempt_1675993570983_0004_000002 exited with exitCode: 1
Failing this attempt.Diagnostics: [2023-02-13 21:06:23.781]Exception from container-launch.
Container id: container_1675993570983_0004_02_000001
Exit code: 1

[2023-02-13 21:06:23.787]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Please check whether your etc/hadoop/mapred-site.xml contains the below configuration:
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>

Solution

Add the suggested properties to mapred-site.xml across the cluster, then restart it.

<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
  • Copyright notice: unless otherwise stated, all articles on this blog are the author's own work. Please credit the source when reposting.
  • Copyright © 2015-2023 henrrywan
