BigQuery: JOIN ON with a repeated / array STRUCT field in Standard SQL? - PullRequest
0 votes
/ July 2, 2018

I basically have two tables, Orders and Items. Because these tables are imported from Google Cloud Datastore backup files, references are not made through a simple ID field. A one-to-one relationship is stored as a <STRUCT> whose id field holds the actual unique ID I want to match on. For one-to-many (REPEATED) relationships, the schema uses ARRAY<STRUCT>.

I can query the one-to-one relationships with a LEFT OUTER JOIN, and I also know how to join a non-repeated struct against a repeated string or int, but I am having trouble writing a similar JOIN query against a repeated struct.

One order with one item:

#standardSQL
WITH Orders AS (
  SELECT 1 AS __oid__, STRUCT(STRUCT(2 AS id, "default" AS ns) AS key) AS item UNION ALL 
  SELECT 2 AS __oid__, STRUCT(STRUCT(4 AS id, "default" AS ns) AS key) AS item UNION ALL 
  SELECT 3 AS __oid__, STRUCT(STRUCT(6 AS id, "default" AS ns) AS key) AS item
),
Items AS (
  SELECT STRUCT(1 AS id, "default" AS ns) AS key, "#1.1" AS title UNION ALL
  SELECT STRUCT(2 AS id, "default" AS ns) AS key, "#1.2" AS title UNION ALL
  SELECT STRUCT(3 AS id, "default" AS ns) AS key, "#1.3" AS title UNION ALL
  SELECT STRUCT(4 AS id, "default" AS ns) AS key, "#1.4" AS title UNION ALL
  SELECT STRUCT(5 AS id, "default" AS ns) AS key, "#1.5" AS title UNION ALL
  SELECT STRUCT(6 AS id, "default" AS ns) AS key, "#1.6" AS title
)

SELECT
   __oid__
  ,Order_item AS item
FROM Orders  

LEFT OUTER JOIN(
  SELECT
     key
    ,title
  FROM Items
) Order_item
ON Order_item.key.id = item.key.id

Result (works as expected):

+-----+---------+--------------+-------------+------------+
| Row | __oid__ |  item.key.id | item.key.ns | item.title |
+-----+---------+--------------+-------------+------------+
|   1 |       1 |            2 |     default |       #1.2 |
+-----+---------+--------------+-------------+------------+
|   2 |       2 |            4 |     default |       #1.4 |
+-----+---------+--------------+-------------+------------+
|   3 |       3 |            6 |     default |       #1.6 |
+-----+---------+--------------+-------------+------------+

A similar query, but this time one order with many items:

#standardSQL
WITH Orders AS (
  SELECT 1 AS __oid__, ARRAY[STRUCT(STRUCT(1 AS id, "default" AS ns) AS key), STRUCT(STRUCT(2 AS id, "default" AS ns) AS key)] AS items UNION ALL 
  SELECT 2 AS __oid__, ARRAY[STRUCT(STRUCT(3 AS id, "default" AS ns) AS key), STRUCT(STRUCT(4 AS id, "default" AS ns) AS key)] AS items UNION ALL 
  SELECT 3 AS __oid__, ARRAY[STRUCT(STRUCT(5 AS id, "default" AS ns) AS key), STRUCT(STRUCT(6 AS id, "default" AS ns) AS key)] AS items
),
Items AS (
  SELECT STRUCT(1 AS id, "default" AS ns) AS key, "#1.1" AS title UNION ALL
  SELECT STRUCT(2 AS id, "default" AS ns) AS key, "#1.2" AS title UNION ALL
  SELECT STRUCT(3 AS id, "default" AS ns) AS key, "#1.3" AS title UNION ALL
  SELECT STRUCT(4 AS id, "default" AS ns) AS key, "#1.4" AS title UNION ALL
  SELECT STRUCT(5 AS id, "default" AS ns) AS key, "#1.5" AS title UNION ALL
  SELECT STRUCT(6 AS id, "default" AS ns) AS key, "#1.6" AS title
)

SELECT
   __oid__
  ,Order_items AS items
FROM Orders  

LEFT OUTER JOIN(
  SELECT
     key
    ,title
  FROM Items
) Order_items
ON Order_items.key.id IN (SELECT item.key.id FROM UNNEST(items) AS item)

Error: IN subquery is not supported inside join predicate.

This is the result I actually expected:

+-----+---------+--------------+-------------+------------+
| Row | __oid__ |  item.key.id | item.key.ns | item.title |
+-----+---------+--------------+-------------+------------+
|   1 |       1 |            1 |     default |       #1.1 |
|     |         |            2 |     default |       #1.2 |
+-----+---------+--------------+-------------+------------+
|   2 |       2 |            3 |     default |       #1.3 |
|     |         |            4 |     default |       #1.4 |
+-----+---------+--------------+-------------+------------+
|   3 |       3 |            5 |     default |       #1.5 |
|     |         |            6 |     default |       #1.6 |
+-----+---------+--------------+-------------+------------+

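How do I have to change the second query to get the expected result?

Answers [ 2 ]

0 votes
/ July 2, 2018

An alternative is to use a CROSS JOIN instead of a LEFT JOIN and move the IN predicate into the WHERE clause, where the subquery is allowed. Note that this behaves like an inner join, so an order with no matching items does not appear in the result.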

#standardSQL
WITH Orders AS (
  SELECT 1 AS __oid__, ARRAY[STRUCT(STRUCT(1 AS id, "default" AS ns) AS key), STRUCT(STRUCT(2 AS id, "default" AS ns) AS key)] AS items UNION ALL 
  SELECT 2 AS __oid__, ARRAY[STRUCT(STRUCT(3 AS id, "default" AS ns) AS key), STRUCT(STRUCT(4 AS id, "default" AS ns) AS key)] AS items UNION ALL 
  SELECT 3 AS __oid__, ARRAY[STRUCT(STRUCT(5 AS id, "default" AS ns) AS key), STRUCT(STRUCT(6 AS id, "default" AS ns) AS key)] AS items
),
Items AS (
  SELECT STRUCT(1 AS id, "default" AS ns) AS key, "#1.1" AS title UNION ALL
  SELECT STRUCT(2 AS id, "default" AS ns) AS key, "#1.2" AS title UNION ALL
  SELECT STRUCT(3 AS id, "default" AS ns) AS key, "#1.3" AS title UNION ALL
  SELECT STRUCT(4 AS id, "default" AS ns) AS key, "#1.4" AS title UNION ALL
  SELECT STRUCT(5 AS id, "default" AS ns) AS key, "#1.5" AS title UNION ALL
  SELECT STRUCT(6 AS id, "default" AS ns) AS key, "#1.6" AS title
)

SELECT
   __oid__
  ,ARRAY_AGG(Order_items) AS items
FROM Orders  

CROSS JOIN(
  SELECT
     key
    ,title
  FROM Items
) Order_items
WHERE Order_items.key.id IN (SELECT item.key.id FROM UNNEST(items) AS item)
GROUP BY __oid__

0 votes
/ July 2, 2018

The problem is that BigQuery can't hash the join keys from the two sides, since the join is expressed as an IN condition. You can make this work by flattening the array on the left-hand side and then aggregating the items on the right:

#standardSQL
WITH Orders AS (
  SELECT 1 AS __oid__, ARRAY[STRUCT(STRUCT(1 AS id, "default" AS ns) AS key), STRUCT(STRUCT(2 AS id, "default" AS ns) AS key)] AS items UNION ALL 
  SELECT 2 AS __oid__, ARRAY[STRUCT(STRUCT(3 AS id, "default" AS ns) AS key), STRUCT(STRUCT(4 AS id, "default" AS ns) AS key)] AS items UNION ALL 
  SELECT 3 AS __oid__, ARRAY[STRUCT(STRUCT(5 AS id, "default" AS ns) AS key), STRUCT(STRUCT(6 AS id, "default" AS ns) AS key)] AS items
),
Items AS (
  SELECT STRUCT(1 AS id, "default" AS ns) AS key, "#1.1" AS title UNION ALL
  SELECT STRUCT(2 AS id, "default" AS ns) AS key, "#1.2" AS title UNION ALL
  SELECT STRUCT(3 AS id, "default" AS ns) AS key, "#1.3" AS title UNION ALL
  SELECT STRUCT(4 AS id, "default" AS ns) AS key, "#1.4" AS title UNION ALL
  SELECT STRUCT(5 AS id, "default" AS ns) AS key, "#1.5" AS title UNION ALL
  SELECT STRUCT(6 AS id, "default" AS ns) AS key, "#1.6" AS title
)

SELECT
   __oid__
  ,ARRAY_AGG(Order_items) AS items
FROM Orders,
UNNEST(items) AS item

LEFT OUTER JOIN(
  SELECT
     key
    ,title
  FROM Items
) Order_items
ON Order_items.key.id = item.key.id
GROUP BY __oid__

This is probably what you wanted anyway, since your original query would have returned items as just a struct rather than as an array of structs.
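
One thing to watch with the flatten-then-aggregate approach: with a LEFT OUTER JOIN, an array element that has no row in Items still produces a row, and the aggregated array then carries an empty placeholder for it. The following is only a sketch of one way to handle that, not part of the original answers. It assumes you want unmatched elements dropped and the original array order kept. The sample id 99 and the aliases o, i and pos are made up for illustration.

```sql
#standardSQL
-- Sketch only: order 2 deliberately references id 99, which has no row in Items.
WITH Orders AS (
  SELECT 1 AS __oid__, ARRAY[STRUCT(STRUCT(2 AS id, "default" AS ns) AS key), STRUCT(STRUCT(1 AS id, "default" AS ns) AS key)] AS items UNION ALL
  SELECT 2 AS __oid__, ARRAY[STRUCT(STRUCT(3 AS id, "default" AS ns) AS key), STRUCT(STRUCT(99 AS id, "default" AS ns) AS key)] AS items
),
Items AS (
  SELECT STRUCT(1 AS id, "default" AS ns) AS key, "#1.1" AS title UNION ALL
  SELECT STRUCT(2 AS id, "default" AS ns) AS key, "#1.2" AS title UNION ALL
  SELECT STRUCT(3 AS id, "default" AS ns) AS key, "#1.3" AS title
)

SELECT
   o.__oid__
  ,ARRAY_AGG(
     -- Emit NULL for unmatched elements so IGNORE NULLS can drop them.
     IF(i.key.id IS NULL, NULL, STRUCT(i.key AS key, i.title AS title))
     IGNORE NULLS
     ORDER BY pos  -- pos is the element's index in the original array
   ) AS items
FROM Orders AS o
CROSS JOIN UNNEST(o.items) AS item WITH OFFSET AS pos

LEFT OUTER JOIN Items AS i
ON i.key.id = item.key.id
GROUP BY o.__oid__
```

With this sample data I would expect order 1 to come back with #1.2 followed by #1.1 (the order in which they appear in its array) and order 2 with only #1.3. If you would rather keep a visible placeholder for missing items, drop the IF and IGNORE NULLS and aggregate a struct that includes item.key.id alongside the title.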
