I am trying to insert data into a table stored as ORC. I know that I first need to create a table stored as TEXTFILE, but when I load the file into the TEXTFILE table it picks up only the first record, even though the file holds thousands of records.
The second problem: when I try to insert the data from the TEXTFILE table into the ORC table, I get the error shown below.
How can I fix this? Thanks.
The text file contains the following:
172199100408438ARP
Here I have 3 records.
CREATE TABLE table_txt (id string, info string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES ("input.regex" = "(.{3})(.{3}).*")
STORED AS TEXTFILE;
CREATE TABLE table_orc (id string, info string)
STORED AS ORC;
LOAD DATA INPATH '/user/hdfs/textfile.TXT' OVERWRITE INTO TABLE table_txt;
INSERT OVERWRITE TABLE table_orc SELECT * FROM table_txt;
When I query the table, I need a result like this:
172 199
100 408
438 ARP
but in the text table I get only:
172 199
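A possible explanation, offered as a sketch rather than a confirmed diagnosis: a TEXTFILE table yields one row per newline-terminated line, and RegexSerDe applies input.regex once to each line, so a file whose records all sit on a single line produces a single row. Assuming every record is exactly 6 characters (3 for id, 3 for info) and the data contains no commas, one way to get one row per record is to load the raw line into a one-column staging table and split it in the query. The table name table_raw, the column name line, and the aliases pieces and chunk are hypothetical.

```sql
-- Hypothetical staging table: a single STRING column that receives each raw line.
-- With the default TEXTFILE row format the field delimiter is Ctrl-A (\001),
-- so a line without that character lands whole in `line`.
CREATE TABLE table_raw (line STRING)
STORED AS TEXTFILE;

LOAD DATA INPATH '/user/hdfs/textfile.TXT' OVERWRITE INTO TABLE table_raw;

-- Append a comma after every 6 characters, split on the comma,
-- and emit one row per chunk. Chunks that are not exactly
-- 6 characters long (such as the trailing empty string) are dropped.
SELECT substr(chunk, 1, 3) AS id,
       substr(chunk, 4, 3) AS info
FROM table_raw
LATERAL VIEW explode(split(regexp_replace(line, '(.{6})', '$1,'), ',')) pieces AS chunk
WHERE length(chunk) = 6;
```

For the sample line 172199100408438ARP this is intended to return the three rows listed above. The same SELECT could feed INSERT OVERWRITE TABLE table_orc directly, which would skip the RegexSerDe table altogether.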
Setting that aside, when I try to move the data from the text table into the ORC table, I get the following:
hive> INSERT OVERWRITE TABLE table_orc SELECT * FROM table_txt;;
Query ID = homosrv_20191010151919_53e1a477-4086-4cb1-b207-60e7d355ba50
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1570556390990_0067
Kill Command = /opt/cloudera/parcels/CDH-5.14.4-1.cdh5.14.4.p0.3/lib/hadoop/bin/hadoop job -kill job_1570556390990_0067
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2019-10-10 15:19:26,095 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:20:26,847 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:21:27,574 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:22:28,253 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:23:28,876 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:24:29,521 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:25:30,168 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:26:30,775 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:27:31,406 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:28:32,003 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:29:32,574 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:30:33,125 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:31:33,695 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:32:34,296 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:33:34,833 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:34:35,398 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:35:35,929 Stage-1 map = 0%, reduce = 0%
2019-10-10 15:36:08,846 Stage-1 map = 100%, reduce = 0%
Ended Job = job_1570556390990_0067 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1570556390990_0067_m_000000 (and more) from job job_1570556390990_0067
Task with the most failures(4):
-----
Task ID:
task_1570556390990_0067_m_000000
URL:
http:
-----
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:455)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
... 17 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:147)
... 22 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.RegexSerDe not found
at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:323)
at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:333)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:116)
... 22 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.RegexSerDe not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2255)
at org.apache.hadoop.hive.ql.plan.PartitionDesc.getDeserializer(PartitionDesc.java:137)
at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:297)
... 24 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
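The last "Caused by" in the trace says that the map task cannot load org.apache.hadoop.hive.contrib.serde2.RegexSerDe. That class ships in the hive-contrib jar, which is not necessarily on the classpath of the MapReduce tasks. Two things that may be worth trying, sketched under assumptions: register the jar in the session before running the insert, or switch table_txt to the RegexSerDe that is built into Hive (org.apache.hadoop.hive.serde2.RegexSerDe), which needs no extra jar. The jar path below is an assumption based on a typical CDH parcel layout and has to be checked on the cluster.

```sql
-- Option 1: make hive-contrib available to the job for this session.
-- ASSUMPTION: the path is a guess at where the CDH parcel keeps the jar; verify it locally.
ADD JAR /opt/cloudera/parcels/CDH/lib/hive/lib/hive-contrib.jar;
INSERT OVERWRITE TABLE table_orc SELECT * FROM table_txt;

-- Option 2: point the existing table at the built-in SerDe instead.
-- The built-in RegexSerDe accepts only STRING columns, which is the case for table_txt.
ALTER TABLE table_txt
SET SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES ("input.regex" = "(.{3})(.{3}).*");
INSERT OVERWRITE TABLE table_orc SELECT * FROM table_txt;
```

Option 2 uses ALTER TABLE rather than dropping and recreating the table, because table_txt is a managed table and dropping it would also delete the file that LOAD DATA INPATH moved into it.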