Создание многоуровневой вложенной структуры с помощью объединений в BigQuery (с использованием вложенных и повторяющихся полей) - PullRequest
1 голос
/ 06 июля 2019

У меня есть несколько плоских таблиц в BigQuery, и я хочу объединить их в одну таблицу, которая использует вложенные и повторяющиеся поля на разных уровнях (три здесь, но потенциально больше уровней в будущем).

Iбыл в состоянии сделать это для одного уровня в соответствии с методами в документах / видео, но я не могу получить правильный синтаксис для нескольких уровней.

#dummy data to demonstrate hierarchy (travellers->cities->places)

WITH 
travellers AS (
SELECT 'Jim' as traveller, 'England' as country UNION ALL
SELECT 'Jim' as traveller, 'Spain' as country UNION ALL
SELECT 'Jill' as traveller, 'France' as country),

cities AS (
SELECT 'England' as country, 'London' as city UNION ALL
SELECT 'England' as country, 'Liverpool' as city UNION ALL
SELECT 'England' as country, 'Manchester' as city  UNION ALL
SELECT 'France' as country, 'Paris' as city UNION ALL
SELECT 'France' as country, 'Nantes' as city UNION ALL
SELECT 'France' as country, 'Marseille' as city  UNION ALL
SELECT 'Spain' as country, 'Granada' as city UNION ALL
SELECT 'Spain' as country, 'Barcelona' as city UNION ALL
SELECT 'Spain' as country, 'Madrid' as city),

places AS (
SELECT 'London' as city, 'Buckingham Palace' as place UNION ALL
SELECT 'London' as city, 'Tooting Bec Lido' as place UNION ALL
SELECT 'Liverpool' as city, 'The Liver Building' as place  UNION ALL
SELECT 'Manchester' as city, 'Old Trafford' as place UNION ALL
SELECT 'Paris' as city, 'Notre Dame' as place UNION ALL
SELECT 'Paris' as city, 'Louvre' as place  UNION ALL
SELECT 'Nantes' as city, 'La Machine' as place UNION ALL
SELECT 'Marseille' as city, 'Le Stade' as place UNION ALL
SELECT 'Granada' as city, 'Alhambra' as place UNION ALL
SELECT 'Granada' as city, 'El Bar de Fede' as place UNION ALL
SELECT 'Barcelona' as city, 'Camp Nou' as place UNION ALL
SELECT 'Madrid' as city, 'Sofia Reina' as place UNION ALL
SELECT 'Madrid' as city, 'El Bar de Edu' as place UNION ALL
SELECT 'Barcelona' as city, 'La Playa' as place UNION ALL
SELECT 'Granada' as city, 'Cafe Andarax' as place),

# full table using typical join (not what I wnat)
full_array_flat as (SELECT * FROM travellers LEFT JOIN cities USING(country) LEFT JOIN places USING(city)),

# simple nesting at a single level (using STRUCT as I will need multiple levels in future, and will need to include additional fields of different types)
travellers_nested AS (SELECT traveller, ARRAY_AGG(STRUCT (country)) as country_array FROM travellers GROUP BY traveller),
cities_nested AS (SELECT country, ARRAY_AGG(STRUCT (city)) as city_array FROM cities GROUP BY country),
places_nested AS (SELECT city, ARRAY_AGG(STRUCT (place)) as place_array FROM places GROUP BY city),

# flattening nested arrays just for fun (!)... trying to test out different combinations
travellers_nested_flattened AS (SELECT traveller, country_flat from travellers_nested, UNNEST(country_array) as country_flat),
cities_nested_flattened AS (SELECT country, city_flat from cities_nested, UNNEST(city_array) as city_flat),
places_nested_flattened AS (SELECT city, place_flat from places_nested, UNNEST(place_array) as place_flat)

# SELECT * FROM travellers_cities_places 
SELECT "WHY OH WHY CAN'T I FIGURE THIS OUT, PLEASE HELP ME SOMEBODY!)" AS cry_for_help 

Представление JSON ожидаемого вывода, например,

[
  {
    "traveller": "Jim",
    "country_array": [
      {
        "country": "England",
        "city_array": [
          {
            "city": "London",
            "place_array": [
              {
                "place": "Buckingham Palace"
              },
              {
                "place": "Tooting Bec Lido"
              }
            ] ...

Однако никакая комбинация ARRAY, STRUCT, UNNESTing или JOINing, кажется, не дает мне ничего подобного этому выводу ... пожалуйста, помогите.Спасибо.

1 Ответ

0 голосов
/ 06 июля 2019

Создайте нужную структуру, одну агрегацию за раз.

Затем преобразуйте результат в строку:

SELECT TO_JSON_STRING(STRUCT(traveller,
                             ARRAY_AGG(STRUCT(country, city_array)) as country_array
                            )
                     )
FROM (SELECT traveller, country,
             ARRAY_AGG(STRUCT(city, place_array)) as city_array
      FROM (SELECT t.traveller, t.country, c.city, ARRAY_AGG(p.place) as place_array
            FROM travellers t JOIN
                 cities c
                 ON t.country = c.country JOIN
                 places p
                 ON c.city = p.city
            GROUP BY t.traveller, t.country, c.city
           ) tcc
      GROUP BY traveller, country
     ) tc
GROUP BY traveller;
...