Преобразование устаревшей функции SQL Flatten в стандартный SQL - PullRequest
0 голосов
/ 06 марта 2019

У меня в #LegacySQL написано следующее:

   SELECT
      customer_email,
      submitted_at,
      title,
      answers.choices.labels answer_choices,
      answers.number score,
      answers.boolean true_false,
      metadata.platform device_type
    FROM
      (FLATTEN([test-test:sample.responses], answers)) resp
      LEFT JOIN [test-test:sample.forms] forms
    ON resp.answers.field.id = forms.id
    ORDER BY 1 ASC

Он возвращает результаты так, как я хочу, вот так:

+------------------+-------------------------+------------------------------------------------------------+--------------------+-------+------------+-------------+
|  customer_email  |      submitted_at       |                           title                            |   answer_choices   | score | true_false | device_type |
+------------------+-------------------------+------------------------------------------------------------+--------------------+-------+------------+-------------+
| myname@gmail.com | 2018-12-25 04:00:02 UTC | How would you rate this product?                           |                    |    10 |            | other       |
| myname@gmail.com | 2018-12-25 04:00:02 UTC | What did you enjoy the most about your experience with us? | Delivery           |       |            | other       |
| myname@gmail.com | 2018-12-25 04:00:02 UTC | What other product(s) would you like to see us make?       | Additional Colors  |       |            | other       |
| myname@gmail.com | 2018-12-25 04:00:02 UTC | What other product(s) would you like to see us make?       | Additional Designs |       |            | other       |
| myname@gmail.com | 2018-12-25 04:00:02 UTC | What color(s) would you want to see?                       | Green              |       |            | other       |
+------------------+-------------------------+------------------------------------------------------------+--------------------+-------+------------+-------------+

Я пытаюсь преобразовать в BigQuery StandarSQL, и я собрал следующее:

SELECT
  customer_email,
  submitted_at,
  title,
  answers.choices.labels answer_choices,
  answers.number score,
  answers.boolean true_false,
  metadata.platform device_type
FROM
`sample.responses` resp, unnest(answers) answers
LEFT JOIN `sample.forms` forms
ON answers.field.id  = forms.id
ORDER BY 1 ASC

К сожалению, он возвращает его в виде записей:

+------------------+-------------------------+-------------------------+------------------------------------------------------------+--------------------+-------+------------+-------------+
|  customer_email  |      submitted_at       |        landed_at        |                           title                            |   answer_choices   | score | true_false | device_type |
+------------------+-------------------------+-------------------------+------------------------------------------------------------+--------------------+-------+------------+-------------+
| myname@gmail.com | 2018-12-25 04:00:02 UTC | 2018-12-25 03:59:07 UTC | What did you enjoy the most about your experience with us? | Delivery           | null  | null       | other       |
| myname@gmail.com | 2018-12-25 04:00:02 UTC | 2018-12-25 03:59:07 UTC | What other product(s) would you like to see us make?       | Additional Colors  | null  | null       | other       |
|                  |                         |                         |                                                            | Additional Designs |       |            |             |
| myname@gmail.com | 2018-12-25 04:00:02 UTC | 2018-12-25 03:59:07 UTC | What color(s) would you want to see?                       | Green              | null  | null       | other       |
+------------------+-------------------------+-------------------------+------------------------------------------------------------+--------------------+-------+------------+-------------+

Что я делаю неправильно?

Ответы [ 2 ]

0 голосов
/ 07 марта 2019

Предполагая, что это запрос для построения начальных данных

SELECT
  'myname@gmail.com' AS customer_email,
  CAST('2018-12-25 04:00:02 UTC' AS TIMESTAMP) AS submitted_at,
  CAST('2018-12-25 03:59:07 UTC' AS TIMESTAMP) AS landed_at,
  'How would you rate this product?' as title,
  STRUCT (
    [
      STRUCT('' as label)
    ] AS choices,
    10 as number,
    NULL as boolean
  ) answers,
  STRUCT (
    'other' AS platform
  ) metadata

UNION ALL

SELECT
  'myname@gmail.com' AS customer_email,
  CAST('2018-12-25 04:00:02 UTC' AS TIMESTAMP) AS submitted_at,
  CAST('2018-12-25 03:59:07 UTC' AS TIMESTAMP) AS landed_at,
  'What did you enjoy the most about your experience with us?' as title,
  STRUCT (
    [
      STRUCT('' as label)
    ] AS choices,
    NULL as number,
    NULL as boolean
  ) answers,
  STRUCT (
    'other' AS platform
  ) metadata

UNION ALL

SELECT
  'myname@gmail.com' AS customer_email,
  CAST('2018-12-25 04:00:02 UTC' AS TIMESTAMP) AS submitted_at,
  CAST('2018-12-25 03:59:07 UTC' AS TIMESTAMP) AS landed_at,
  'What other product(s) would you like to see us make?' as title,
  STRUCT (
    [
      STRUCT('Additional Designs' as label),
      STRUCT('Additional Colors' as label)
    ] AS choices,
    NULL as number,
    NULL as boolean
  ) answers,
  STRUCT (
    'other' AS platform
  ) metadata

UNION ALL

SELECT
  'myname@gmail.com' AS customer_email,
  CAST('2018-12-25 04:00:02 UTC' AS TIMESTAMP) AS submitted_at,
  CAST('2018-12-25 03:59:07 UTC' AS TIMESTAMP) AS landed_at,
  'What color(s) would you want to see?' as title,
  STRUCT (
    [
      STRUCT('Green' as label)
    ] AS choices,
    NULL as number,
    NULL as boolean
  ) answers,
  STRUCT (
    'other' AS platform
  ) metadata

Тогда ваш запрос будет

WITH joined_table AS (
    SELECT
        customer_email,
        submitted_at,
        landed_at,
        title,
        answers.choices answers_choices,
        answers.number score,
        answers.boolean true_false,
        metadata.platform device_type
    FROM `sample.responses` resp
    LEFT JOIN `sample.forms` forms ON resp.answers.field.id  = forms.id
)

SELECT 
    customer_email,
    submitted_at,
    landed_at,
    title,
    unnested_answers_choices.label as answer_choices,
    score,
    true_false,
    device_type
FROM joined_table
CROSS JOIN UNNEST(answers_choices) AS unnested_answers_choices
0 голосов
/ 06 марта 2019

Следует иметь в виду, что унаследованный SQL выполняет LEFT JOIN с массивами при их выравнивании, тогда как стандартный оператор запятой SQL выполняет обычный JOIN. Если вы используете LEFT JOIN при удалении массива, вы получите строку в выводе (с NULL для значения элемента), когда массив пуст, тогда как если вы используете обычный JOIN, строка будет опущено. Это похоже на то, как работают соединения между таблицами. См. руководство по миграции для более подробной информации о семантике.

Я думаю, что вы хотите, чтобы ваш запрос использовал LEFT JOIN между таблицей и массивом answers:

SELECT
  customer_email,
  submitted_at,
  title,
  answer.choices.labels AS answer_choices,
  answer.number score,
  answer.boolean true_false,
  metadata.platform device_type
FROM
  `sample.responses` resp
LEFT JOIN UNNEST(resp.answers) AS answer
LEFT JOIN `sample.forms` forms
ON answer.field.id  = forms.id
ORDER BY 1 ASC

Если вы также хотите сгладить answer.choices.labels, вы можете изменить его на:

SELECT
  customer_email,
  submitted_at,
  title,
  label AS answer_choices,
  answer.number score,
  answer.boolean true_false,
  metadata.platform device_type
FROM
  `sample.responses` resp
LEFT JOIN UNNEST(resp.answers) AS answer
LEFT JOIN UNNEST(answer.choices.labels) AS label
LEFT JOIN `sample.forms` forms
ON answer.field.id  = forms.id
ORDER BY 1 ASC
Добро пожаловать на сайт PullRequest, где вы можете задавать вопросы и получать ответы от других членов сообщества.
...