Hive query throws a SemanticException when the WHERE clause lists more than a certain number of partitions (30 in my case) and succeeds when there are fewer than 30
June 26, 2019

I am running a Hive query and getting the following exception:

.HiveSQLException: Error while compiling statement: FAILED: SemanticException MetaException(message:The arguments for IN should be the same type!Types are: {struct<col1:string,col2:string,col3:string> IN (struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>..{No.of partitions I give here}

The same query always succeeds if I reduce the number of partitions in the WHERE condition. I'm not sure where exactly it fails, but one thing I have worked out is that it checks the partition information in the Hive metastore and fails at some point there.

My table is partitioned by year, month, and day, all of string data type. Any ideas why this happens?

I tried changing the number of partitions used in the condition. This is the full error:

Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException MetaException(message:The arguments for IN should be the same type! Types are: {struct<col1:string,col2:string,col3:string> IN (struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>, struct<col1:int,col2:int,col3:int>)})
    at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315)
    at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:112)
    at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)

Query:

SELECT
    `id` as `id`,
    `source_id` as `source_id`,
    `source` as `source`,
    `created_at` as `created_at`,
    `updated_at` as `updated_at`,
    `deleted_at` as `deleted_at`,
    `is_done` as `is_done`,
    year, month, day
FROM
  (SELECT RANK() OVER w AS rnk,
          ROW_NUMBER() OVER w AS row_num,
          t1.*
   FROM
     (
          SELECT *
          FROM `test.metrics`
               WHERE ((YEAR = 2019 AND MONTH = 5 AND DAY=21) OR (YEAR = 2019 AND MONTH = 5 AND DAY=22) OR (YEAR = 2019 AND MONTH = 5 AND DAY=23) OR (YEAR = 2019 AND MONTH = 5 AND DAY=24) OR (YEAR = 2019 AND MONTH = 5 AND DAY=20) OR (YEAR = 2019 AND MONTH = 5 AND DAY=18) OR (YEAR = 2019 AND MONTH = 5 AND DAY=19) OR (YEAR = 2019 AND MONTH = 5 AND DAY=25) OR (YEAR = 2019 AND MONTH = 5 AND DAY=26) OR (YEAR = 2019 AND MONTH = 5 AND DAY=16) OR (YEAR = 2019 AND MONTH = 5 AND DAY=17))
          UNION ALL
          SELECT
              cast(data["id"] as INT) as `id`,
              cast(data["source_id"] as INT) as `source_id`,
              data["source"] as `source`,
              data["created_at"] as `created_at`,
              data["updated_at"] as `updated_at`,
              data["deleted_at"] as `deleted_at`,
              cast(data["is_done"] as SMALLINT) as `is_done`,
              YEAR(data["created_at"]) as `year`,
              MONTH(data["created_at"]) as `month`,
              DAY(data["created_at"]) as `day`
          FROM temp.metrics
               WHERE ((YEAR = 2019 AND MONTH = 5 AND DAY=9) OR (YEAR = 2019 AND MONTH = 5 AND DAY=8) OR (YEAR = 2019 AND MONTH = 5 AND DAY=7) OR (YEAR = 2019 AND MONTH = 5 AND DAY=6) OR (YEAR = 2019 AND MONTH = 5 AND DAY=5) OR (YEAR = 2019 AND MONTH = 4 AND DAY=26) OR (YEAR = 2019 AND MONTH = 4 AND DAY=27) OR (YEAR = 2019 AND MONTH = 4 AND DAY=25) OR (YEAR = 2019 AND MONTH = 4 AND DAY=28) OR (YEAR = 2019 AND MONTH = 4 AND DAY=29) OR (YEAR = 2019 AND MONTH = 5 AND DAY=10) OR (YEAR = 2019 AND MONTH = 5 AND DAY=11) OR (YEAR = 2019 AND MONTH = 5 AND DAY=12) OR (YEAR = 2019 AND MONTH = 5 AND DAY=13) OR (YEAR = 2019 AND MONTH = 5 AND DAY=18) OR (YEAR = 2019 AND MONTH = 5 AND DAY=19) OR (YEAR = 2019 AND MONTH = 5 AND DAY=14) OR (YEAR = 2019 AND MONTH = 5 AND DAY=15) OR (YEAR = 2019 AND MONTH = 5 AND DAY=16) OR (YEAR = 2019 AND MONTH = 5 AND DAY=17) OR (YEAR = 2019 AND MONTH = 4 AND DAY=30) OR (YEAR = 2019 AND MONTH = 5 AND DAY=21) OR (YEAR = 2019 AND MONTH = 5 AND DAY=22) OR (YEAR = 2019 AND MONTH = 5 AND DAY=23) OR (YEAR = 2019 AND MONTH = 5 AND DAY=24) OR (YEAR = 2019 AND MONTH = 5 AND DAY=20) OR (YEAR = 2019 AND MONTH = 5 AND DAY=4) OR (YEAR = 2019 AND MONTH = 5 AND DAY=3) OR (YEAR = 2019 AND MONTH = 5 AND DAY=2) OR (YEAR = 2019 AND MONTH = 5 AND DAY=1) OR (YEAR = 2019 AND MONTH = 5 AND DAY=25) OR (YEAR = 2019 AND MONTH = 5 AND DAY=26))
     ) t1
   WINDOW w AS (PARTITION BY id ORDER BY updated_at DESC)
  ) t
    WHERE rnk =1 AND row_num =1  AND ((YEAR = 2019 AND MONTH = 5 AND DAY=21) OR (YEAR = 2019 AND MONTH = 5 AND DAY=22) OR (YEAR = 2019 AND MONTH = 5 AND DAY=23) OR (YEAR = 2019 AND MONTH = 5 AND DAY=24) OR (YEAR = 2019 AND MONTH = 5 AND DAY=20) OR (YEAR = 2019 AND MONTH = 5 AND DAY=18) OR (YEAR = 2019 AND MONTH = 5 AND DAY=19) OR (YEAR = 2019 AND MONTH = 5 AND DAY=25) OR (YEAR = 2019 AND MONTH = 5 AND DAY=26) OR (YEAR = 2019 AND MONTH = 5 AND DAY=16) OR (YEAR = 2019 AND MONTH = 5 AND DAY=17))  DISTRIBUTE BY year,month,day
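
The error message reports a string struct on the left of IN and integer structs on the right, and the question states that year, month, and day are string partition columns. One thing that may be worth trying is comparing those columns to string literals instead of integers. The snippet below is only a sketch under that assumption, shortened to two partitions. The literal format is a guess and has to match how the partition values are actually stored (for example '5' versus '05'). The commented-out session setting is a second, separate thing to try, on the assumption that Hive's point-lookup rewrite of long OR chains into IN is what kicks in around 30 predicates. Neither is a confirmed fix for this case.

```sql
-- Assumption: year, month, day are STRING partition columns, as stated in the question.
-- String literals keep both sides of any generated IN (...) comparison the same type.
SELECT *
FROM `test.metrics`
WHERE (year = '2019' AND month = '5' AND day = '21')
   OR (year = '2019' AND month = '5' AND day = '22');

-- Alternative to try for the current session only (assumption: the point-lookup
-- optimizer is what converts the OR chain into a struct IN list):
-- SET hive.optimize.point.lookup=false;
```

If the quoted form works with the full partition list, the same change would need to be applied to all three WHERE clauses in the query above.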