I have a query like the one below:
SELECT
    MAX(m.org_id) as orgId,
    MAX(m.org_name) as orgName,
    MAX(m.app_id) as appId,
    MAX(r.country_or_region) as country,
    MAX(r.local_spend_currency) as currency,
    SUM(r.local_spend_amount) as spend,
    SUM(r.impressions) as impressions
    ...
FROM report r
LEFT JOIN metadata m
    ON m.org_id = r.org_id
    AND m.campaign_id = r.campaign_id
    AND m.ad_group_id = r.ad_group_id
WHERE (r.report_date BETWEEN '2019-01-01' AND '2019-10-10')
    AND r.org_id IN (1138740,1212430,1236970,1238450,1238520,1200980, .... more than 50)
GROUP BY r.country_or_region, r.ad_group_id, r.keyword_id, r.keyword, r.text
OFFSET 0
LIMIT 20
The EXPLAIN output looks like this:
Limit (cost=24.47..24.57 rows=1 width=681)
  -> GroupAggregate (cost=24.47..24.57 rows=1 width=681)
        Group Key: r.country_or_region, r.ad_group_id, r.keyword_id, r.keyword, r.text
        -> Sort (cost=24.47..24.48 rows=1 width=181)
              Sort Key: r.country_or_region, r.ad_group_id, r.keyword_id, r.keyword, r.text
              -> Nested Loop Left Join (cost=12.28..24.46 rows=1 width=181)
                    -> Bitmap Heap Scan on report r (cost=12.00..16.15 rows=1 width=82)
                          Recheck Cond: ((report_date >= '2019-01-01'::date) AND (report_date <= '2019-10-10'::date) AND (org_id = ANY ('{1138740,1212430,1236970,1238450,1238520,1200980 ...}'::numeric[])))
                          -> Bitmap Index Scan on idx_date_org_cty (cost=0.00..12.00 rows=1 width=0)
                                Index Cond: ((report_date >= '2019-01-01'::date) AND (report_date <= '2019-10-10'::date) AND (org_id = ANY ('{1138740,1212430,1236970,1238450,1238520,1200980,1221910 ...}'::numeric[])))
                    -> Index Scan using idx_16569_primary on ad_group_metadata m (cost=0.28..8.30 rows=1 width=114)
                          Index Cond: ((org_id = r.org_id) AND (campaign_id = r.campaign_id) AND (ad_group_id = r.ad_group_id))
I have already run "VACUUM FULL ANALYZE".
EXPLAIN ANALYZE output:
Limit (cost=854136.16..854138.08 rows=20 width=563) (actual time=1755.154..1755.369 rows=20 loops=1)
  -> GroupAggregate (cost=854136.16..874013.30 rows=206845 width=563) (actual time=1755.153..1755.363 rows=20 loops=1)
        Group Key: r.country_or_region, r.ad_group_id, r.keyword_id, r.keyword, r.text
        -> Sort (cost=854136.16..854661.68 rows=210206 width=222) (actual time=1755.069..1755.122 rows=196 loops=1)
              Sort Key: r.country_or_region, r.ad_group_id, r.keyword_id, r.keyword, r.text
              Sort Method: external merge Disk: 52960kB
              -> Hash Left Join (cost=3196.06..813278.43 rows=210206 width=222) (actual time=113.734..1384.338 rows=152571 loops=1)
                    Hash Cond: ((r.org_id = m.org_id) AND (r.campaign_id = m.campaign_id) AND (r.ad_group_id = m.ad_group_id))
                    -> Index Scan using idx_orgid_date_campid on report r (cost=0.56..800305.56 rows=210206 width=119) (actual time=19.898..1192.910 rows=152571 loops=1)
                          Index Cond: ((org_id = ANY ('{1138740,1212430,1236970,1238450,1238520 ...}'::bigint[])) AND (report_date >= '2019-09-01'::date) AND (report_date <= '2019-10-10'::date))
                    -> Hash (cost=1739.09..1739.09 rows=41509 width=119) (actual time=93.659..93.659 rows=41509 loops=1)
                          Buckets: 32768 Batches: 2 Memory Usage: 3550kB
                          -> Seq Scan on ad_group_metadata m (cost=0.00..1739.09 rows=41509 width=119) (actual time=0.006..76.137 rows=41509 loops=1)
Planning Time: 0.815 ms
Execution Time: 1762.834 ms
My query has no ORDER BY, yet the EXPLAIN ANALYZE output shows an additional Sort node. Why is it there?
The query performs very well with a single value in the IN list, but it becomes inefficient as the number of IN values grows. How can I improve it?
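One experiment that may be worth running, offered as a sketch rather than a verified fix: the Sort node feeds GroupAggregate, which needs its input ordered by the GROUP BY keys, and the plan above reports that this sort spilled to disk (Sort Method: external merge Disk: 52960kB). Raising work_mem for the session might let the sort run in memory, or let the planner choose a hash-based aggregate instead. The 128MB value, the shortened IN list, and the reduced select list below are assumptions made for illustration only.

```sql
-- Assumption: 128MB is an illustrative value, not a tuned recommendation.
-- The setting applies to the current session only.
SET work_mem = '128MB';

-- Reduced version of the query: only the org ids visible in the post
-- and a subset of the aggregates, to compare the Sort node before and after.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
    MAX(r.country_or_region) AS country,
    SUM(r.local_spend_amount) AS spend,
    SUM(r.impressions) AS impressions
FROM report r
WHERE r.report_date BETWEEN '2019-01-01' AND '2019-10-10'
    AND r.org_id IN (1138740, 1212430, 1236970, 1238450, 1238520, 1200980)
GROUP BY r.country_or_region, r.ad_group_id, r.keyword_id, r.keyword, r.text
LIMIT 20;

-- Restore the server default for this session afterwards.
RESET work_mem;
```

If the new plan shows "Sort Method: quicksort" or a HashAggregate node in place of Sort plus GroupAggregate, the disk spill was at least part of the cost. The Index Scan on report still accounts for most of the measured time in the plan above, so this alone may not be enough.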