Hive: execution error when the WHERE clause contains a subquery
February 7, 2020

I have two tables. Table 1 is large and Table 2 is small. I would like to extract the rows of Table 1 whose values in Table1.column1 match values in Table2.column1. Both Table 1 and Table 2 have a column named column1. Here is my code.

select *
from Table1
where condition1
and condition2
and column1 in (select column1 from Table2)

Condition 1 and Condition 2 are there to limit the size of the extracted table. I am not sure this actually works. I then got an execution error, return code 1. I am on the Hue platform.
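For reference, the same filter can also be written in Hive as a LEFT SEMI JOIN. This is only a sketch of an equivalent form, not a confirmed fix for the error: condition1 and condition2 are the placeholders from the query above, and the aliases t1 and t2 are introduced here for illustration.

```sql
-- Keep the rows of Table1 that have at least one match in Table2.
-- condition1 and condition2 are the placeholders from the question.
-- With LEFT SEMI JOIN, Table2 may be referenced only in the ON clause.
SELECT t1.*
FROM Table1 t1
LEFT SEMI JOIN Table2 t2
  ON t1.column1 = t2.column1
WHERE condition1
  AND condition2;
```

Like the IN subquery, a semi join returns each matching row of Table1 once, even if Table2 contains the same column1 value several times.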

EDIT

As @yammanuruarun suggested, I tried the following code.

SELECT *
FROM
  (SELECT *
   FROM Table1
   WHERE condition1
     AND condition2) t1
INNER JOIN Table2 t2 ON t1.column1 = t2.column1

Then I got the following error.

Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Application application_1580875150091_97539 failed 2 times due to AM Container for appattempt_1580875150091_97539_000002 exited with exitCode: 255
Failing this attempt.Diagnostics: [2020-02-07 14:35:53.944]Exception from container-launch.
Container id: container_e1237_1580875150091_97539_02_000001
Exit code: 255
Exception message: Launch container failed
Shell output: main : command provided 1
main : run as user is hive
main : requested yarn user is hive
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file /disk-11/hadoop/yarn/local/nmPrivate/application_1580875150091_97539/container_e1237_1580875150091_97539_02_000001/container_e1237_1580875150091_97539_02_000001.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...
[2020-02-07 14:35:53.967]Container exited with a non-zero exit code 255. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "IPC Server idle connection scanner for port 26888"
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
[2020-02-07 14:35:53.967]Container exited with a non-zero exit code 255. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "IPC Server idle connection scanner for port 26888"
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
Halting due to Out Of Memory Error...
For more detailed output, check the application tracking page: http://dcwipphm12002.edc.nam.gm.com:8088/cluster/app/application_1580875150091_97539 Then click on links to logs of each attempt.
. Failing the application.

This looks like a memory error. Is there any way to optimize my query?
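The log reports the OutOfMemoryError in the AM (Application Master) container, so one thing that might be worth trying before rewriting the query further is to raise the Tez memory settings for the session. The fragment below is a sketch under assumptions: the property names are standard Hive and Tez settings, but the values are purely illustrative, the cluster may cap or ignore them, and Application Master settings may only take effect in a newly opened session.

```sql
-- Illustrative values only; suitable sizes depend on the cluster.
-- Memory (MB) for the Tez Application Master container.
SET tez.am.resource.memory.mb=4096;
-- Memory (MB) for each Tez task container.
SET hive.tez.container.size=4096;
-- Task JVM heap, kept below the container size.
SET hive.tez.java.opts=-Xmx3276m;
-- Allow Hive to turn the join with the small table into a map join.
SET hive.auto.convert.join=true;
```

If the settings are accepted, the query can be rerun in the same session. If the error persists, the per-attempt logs linked from the application tracking page should show which container ran out of memory.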

...