I need to concatenate all NOTI_TEXT values that belong to the same NOTI_ID. A single NOTI_ID can have several NOTI_TEXT rows. I am using XMLAGG, but the query runs out of spool space.
The query is below:
select
    NOTI_ID,
    cast(XMLAGG(NOTI_TEXT order by NOTI_TEXT_LINE_ID) as varchar(32000)) as NOTI_TEXT,
    NOTI_COUNTRY_ID,
    NOTI_MAT_DIVISION_ID,
    NOTI_MAT_DIVISION_TEXT,
    NOTI_SOURCESYSTEM_ID,
    CURRENT_DATE as TABLE_LOAD_DT
from
    HC_PRD_D_RDDL_SDTB_0_1_0_0_0_0_0_0.SDTB_DM_SEV_111_NOTI_TXT_LINES_TEST_1
group by
    NOTI_ID,
    NOTI_COUNTRY_ID,
    NOTI_MAT_DIVISION_ID,
    NOTI_MAT_DIVISION_TEXT,
    NOTI_SOURCESYSTEM_ID
All relevant statistics have been collected. The skew factor of the source table is 1.5. The EXPLAIN plan is below:
Explain select
    NOTI_ID,
    cast(XMLAGG(NOTI_TEXT order by NOTI_TEXT_LINE_ID) as varchar(32000)) as NOTI_TEXT,
    NOTI_COUNTRY_ID,
    NOTI_MAT_DIVISION_ID,
    NOTI_MAT_DIVISION_TEXT,
    NOTI_SOURCESYSTEM_ID,
    CURRENT_DATE as TABLE_LOAD_DT
from
    HC_PRD_D_RDDL_SDTB_0_1_0_0_0_0_0_0.SDTB_DM_SEV_111_NOTI_TXT_LINES_TEST_1
group by
    NOTI_ID,
    NOTI_COUNTRY_ID,
    NOTI_MAT_DIVISION_ID,
    NOTI_MAT_DIVISION_TEXT,
    NOTI_SOURCESYSTEM_ID;
1) First, we lock
HC_PRD_D_RDDL_SDTB_0_1_0_0_0_0_0_0.SDTB_DM_SEV_111_NOTI_TXT_LINES_TEST
_1 for read on a reserved RowHash in all partitions to prevent
global deadlock.
2) Next, we lock
HC_PRD_D_RDDL_SDTB_0_1_0_0_0_0_0_0.SDTB_DM_SEV_111_NOTI_TXT_LINES_TEST
_1 for read.
3) We do an all-AMPs SUM step to aggregate from
HC_PRD_D_RDDL_SDTB_0_1_0_0_0_0_0_0.SDTB_DM_SEV_111_NOTI_TXT_LINES_TEST
_1 by way of an all-rows scan with no residual conditions, and the
grouping identifier in field 1. Aggregate Intermediate Results
are computed globally, then placed in Spool 3. The input table
will not be cached in memory, but it is eligible for synchronized
scanning. The aggregate spool file will not be cached in memory.
The size of Spool 3 is estimated with high confidence to be
13,749,188 rows (64,456,193,344 bytes). The estimated time for
this step is 15 hours and 20 minutes.
4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs. The result spool file will not be cached in memory.
The size of Spool 1 is estimated with high confidence to be
13,749,188 rows (148,092,503,948 bytes). The estimated time for
this step is 4 minutes and 10 seconds.
5) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 15 hours and 24 minutes.
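As a rough sanity check on the plan above, taking the optimizer's estimates at face value, dividing each spool's estimated size by its estimated row count gives the row width the optimizer is assuming. The column aliases are made up for illustration; the numbers are copied from steps 3 and 4.

```sql
-- Estimated bytes per row, using the figures reported in the EXPLAIN output.
-- Alias names are illustrative only.
select
    cast(64456193344 as decimal(18,0)) / 13749188 as SPOOL3_BYTES_PER_ROW,   -- step 3, aggregate spool
    cast(148092503948 as decimal(18,0)) / 13749188 as SPOOL1_BYTES_PER_ROW;  -- step 4, result spool
```

Both divisions come out exact: 4,688 bytes per row for Spool 3 and 10,771 bytes per row for Spool 1. This suggests the spool estimate is driven by very wide rows rather than by an unusually high row count.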
We have never seen any anomalies when using this table in other queries. I would like to know whether this query can be optimized further, or whether there is an alternative way to achieve the same result.
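One check that might help narrow this down, sketched here under the assumption that NOTI_TEXT is a character column: count the lines and total characters per NOTI_ID to see how large the concatenated values get relative to the varchar(32000) target. LINE_CNT and TOTAL_CHARS are names chosen for illustration and do not exist in the table.

```sql
-- Sketch: find the NOTI_IDs whose concatenated text would be largest.
-- Assumes NOTI_TEXT is CHAR/VARCHAR; LINE_CNT and TOTAL_CHARS are illustrative aliases.
select top 100
    NOTI_ID,
    count(*) as LINE_CNT,
    sum(cast(character_length(NOTI_TEXT) as bigint)) as TOTAL_CHARS
from
    HC_PRD_D_RDDL_SDTB_0_1_0_0_0_0_0_0.SDTB_DM_SEV_111_NOTI_TXT_LINES_TEST_1
group by
    NOTI_ID
order by
    TOTAL_CHARS desc;
```

If a small number of NOTI_IDs account for most of the text, that would point at uneven group sizes rather than at the table's overall skew factor.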