Consolidating multiple fields into one
0 votes
14 February 2019

My data currently looks something like this:

+------+------------------------------------------------------------+--------------------------+
|  id  |                          question                          |         response         |
+------+------------------------------------------------------------+--------------------------+
| 1234 | What did you enjoy the most about your experience with us? | Delivery                 |
| 1234 | What did you enjoy the most about your experience with us? | Customer Service         |
| 1234 | What about our Customer Service could we improve?          | Response Time            |
| 1234 | What about our Customer Service could we improve?          | Less Email               |
| 1234 | What other products would you like to see us make?         | Table                    |
| 5678 | What about our Customer Service could we improve?          | Response Time            |
| 5678 | What about our Customer Service could we improve?          | Site Navigation          |
| 5678 | What other products would you like to see us make?         | Bookshelf                |
| 5678 | What other products would you like to see us make?         | Table                    |
| 5678 | What other products would you like to see us make?         | Chairs                   |
| 9999 | What did you enjoy the most about your experience with us? | Customer Service         |
| 9999 | What did you enjoy the most about your experience with us? | Ease of Assembly         |
| 9999 | What did you enjoy the most about your experience with us? | Pricing                  |
| 9999 | What about our delivery could we improve?                  | Shipping Time            |
| 9999 | What about our delivery could we improve?                  | Custom Delivery          |
| 9999 | What other products would you like to see us make?         | Bookshelf                |
+------+------------------------------------------------------------+--------------------------+

You'll notice that not only does each question have its own row, but the same question is repeated per id with a different value in response. What makes this tricky is that the number of responses an id gives to a question is not consistent: 5678 gave three responses to "What other products would you like to see us make?", while 9999 gave only one. I'm not sure whether it's relevant, but the number of responses an id can give to a question will never exceed four. The responses are predefined from a list.

I'd like to reshape my data so that, for each id, there is a 1:1 relationship between question and response, like this:

+------+------------------------------------------------------------+---------------------------------------------+
|  id  |                          question                          |                  response                   |
+------+------------------------------------------------------------+---------------------------------------------+
| 1234 | What did you enjoy the most about your experience with us? | Delivery, Customer Service                  |
| 1234 | What about our Customer Service could we improve?          | Response Time, Less Email                   |
| 1234 | What other products would you like to see us make?         | Table                                       |
| 5678 | What about our Customer Service could we improve?          | Response Time, Site Navigation              |
| 5678 | What other products would you like to see us make?         | Bookshelf, Table, Chairs                    |
| 9999 | What did you enjoy the most about your experience with us? | Customer Service, Ease of Assembly, Pricing |
| 9999 | What about our delivery could we improve?                  | Shipping Time, Custom Delivery              |
| 9999 | What other products would you like to see us make?         | Bookshelf                                   |
+------+------------------------------------------------------------+---------------------------------------------+

It would be helpful to have the responses separated by commas, but I'm not sure whether this requires some form of concatenation over a partition or whether there is a built-in function that can do it.

1 Answer

0 votes
14 February 2019

The query below is for BigQuery Standard SQL:

#standardSQL
SELECT id, question, STRING_AGG(response, ', ') response
FROM `project.dataset.table`
GROUP BY id, question

You can test and play with the query above using the sample data from your question, as in the example below:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1234 id, 'What did you enjoy the most about your experience with us?' question, 'Delivery' response UNION ALL
  SELECT 1234, 'What did you enjoy the most about your experience with us?', 'Customer Service' UNION ALL
  SELECT 1234, 'What about our Customer Service could we improve?', 'Response Time' UNION ALL
  SELECT 1234, 'What about our Customer Service could we improve?', 'Less Email' UNION ALL
  SELECT 1234, 'What other products would you like to see us make?', 'Table' UNION ALL
  SELECT 5678, 'What about our Customer Service could we improve?', 'Response Time' UNION ALL
  SELECT 5678, 'What about our Customer Service could we improve?', 'Site Navigation' UNION ALL
  SELECT 5678, 'What other products would you like to see us make?', 'Bookshelf' UNION ALL
  SELECT 5678, 'What other products would you like to see us make?', 'Table' UNION ALL
  SELECT 5678, 'What other products would you like to see us make?', 'Chairs' UNION ALL
  SELECT 9999, 'What did you enjoy the most about your experience with us?', 'Customer Service' UNION ALL
  SELECT 9999, 'What did you enjoy the most about your experience with us?', 'Ease of Assembly' UNION ALL
  SELECT 9999, 'What did you enjoy the most about your experience with us?', 'Pricing' UNION ALL
  SELECT 9999, 'What about our delivery could we improve?', 'Shipping Time' UNION ALL
  SELECT 9999, 'What about our delivery could we improve?', 'Custom Delivery' UNION ALL
  SELECT 9999, 'What other products would you like to see us make?', 'Bookshelf' 
)
SELECT id, question, STRING_AGG(response, ', ') response
FROM `project.dataset.table`
GROUP BY id, question
-- ORDER BY id, question

with the result:

Row id      question                                                    response     
1   1234    What about our Customer Service could we improve?           Response Time, Less Email    
2   1234    What did you enjoy the most about your experience with us?  Delivery, Customer Service   
3   1234    What other products would you like to see us make?          Table    
4   5678    What about our Customer Service could we improve?           Response Time, Site Navigation   
5   5678    What other products would you like to see us make?          Bookshelf, Table, Chairs     
6   9999    What about our delivery could we improve?                   Shipping Time, Custom Delivery   
7   9999    What did you enjoy the most about your experience with us?  Customer Service, Ease of Assembly, Pricing  
8   9999    What other products would you like to see us make?          Bookshelf    
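
One thing the query above leaves open is the order of the responses inside each concatenated string: without an ORDER BY inside STRING_AGG, BigQuery does not guarantee it. As a minimal sketch, assuming alphabetical order is acceptable (the sample data has no column recording the original answer order, so sorting by response is my assumption, not something the question asks for), STRING_AGG accepts its own ORDER BY clause. The three sample rows are taken from the question.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 5678 id, 'What other products would you like to see us make?' question, 'Bookshelf' response UNION ALL
  SELECT 5678, 'What other products would you like to see us make?', 'Table' UNION ALL
  SELECT 5678, 'What other products would you like to see us make?', 'Chairs'
)
SELECT
  id,
  question,
  -- ORDER BY inside the aggregate fixes the order of the concatenated values
  -- (alphabetical here, which is an assumption, not a requirement from the question)
  STRING_AGG(response, ', ' ORDER BY response) AS response
FROM `project.dataset.table`
GROUP BY id, question
ORDER BY id, question
```

For these rows the response column should come back as Bookshelf, Chairs, Table regardless of how the rows are stored. If the same response could appear more than once for an id and question, STRING_AGG(DISTINCT response, ', ') would collapse the duplicates.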