I'm new to PostgreSQL and PostGIS, so this may be a naive question: I'd like to know how to optimize the query below.
Here are the details.
PostgreSQL version: 10.10
Table columns and indexes:
testgis=# \d test;
Table "public.test"
Column | Type | Collation | Nullable | Default
------------+------------------+-----------+----------+--------------------------------
id | bigint | | not null | nextval('serial_id'::regclass)
location | geography | | not null |
latitude | double precision | | not null |
longitude | double precision | | not null |
time_range | tsrange | | |
int1 | integer | | not null |
int2 | integer | | not null |
ids1 | bigint[] | | |
ids2 | bigint[] | | |
Indexes:
"btree_int1" btree (int1)
"btree_int2" btree (int2)
"gin_ids1" gin (ids1)
"gin_ids2" gin (ids2)
"gist_location" gist (location)
"gist_time_range" gist (time_range)
Size information:
SELECT row_estimate,pg_size_pretty(total_bytes) AS total
, pg_size_pretty(index_bytes) AS INDEX
, pg_size_pretty(toast_bytes) AS toast
, pg_size_pretty(table_bytes) AS TABLE
FROM (
SELECT *, total_bytes-index_bytes-COALESCE(toast_bytes,0) AS table_bytes FROM (
SELECT relname AS TABLE_NAME
, c.reltuples AS row_estimate
, pg_total_relation_size(c.oid) AS total_bytes
, pg_indexes_size(c.oid) AS index_bytes
, pg_total_relation_size(reltoastrelid) AS toast_bytes
FROM pg_class c
LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE relkind = 'r' and relname='test'
) a
) a;
row_estimate | total | index | toast | table
--------------+--------+-------+------------+-------
302471 | 148 MB | 80 MB | 8192 bytes | 68 MB
The location column holds a point; the INSERT statement looks like this:
INSERT INTO test (location,latitude,longitude,time_range,int1,int2,ids1,ids2)
VALUES
(ST_GeographyFromText('POINT(106.800382 -6.098953)'), -6.098953, 106.800382, '[2019-09-01 00:00:00, 2019-09-20 00:00:00]', 1, 2, '{100, 101}', '{50}')
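
The INSERT above writes the same coordinates three times: once as WKT text and once each in latitude and longitude. As a sketch only, the point could instead be built from the numeric values with ST_MakePoint, which takes longitude first and latitude second. SRID 4326 is an assumption here, chosen because ST_GeographyFromText defaults to it when the text carries no SRID.

```sql
-- Sketch: build the geography from numeric lon/lat instead of WKT text.
-- ST_MakePoint(x, y) expects longitude first, then latitude.
-- SRID 4326 is assumed (the default used by ST_GeographyFromText).
INSERT INTO test (location, latitude, longitude, time_range, int1, int2, ids1, ids2)
VALUES (
    ST_SetSRID(ST_MakePoint(106.800382, -6.098953), 4326)::geography,
    -6.098953,
    106.800382,
    '[2019-09-01 00:00:00, 2019-09-20 00:00:00]',
    1,
    2,
    '{100, 101}',
    '{50}'
);
```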
Here is my query and its plan:
explain (analyze, buffers) select id
from test
where
location <-> ST_GeographyFromText('POINT(106.800382 -6.098953)') < 15000
and
(ids1 @> ARRAY[100]::bigint[] or ids2 @> ARRAY[100]::bigint[])
order by location <-> ST_GeographyFromText('POINT(106.800382 -6.098953)') limit 800;
```none
Limit  (cost=0.28..8858.30 rows=800 width=16) (actual time=1.126..28.605 rows=800 loops=1)
  Buffers: shared hit=7408
  ->  Index Scan using gist_location on test  (cost=0.28..131730.06 rows=11897 width=16) (actual time=1.126..28.507 rows=800 loops=1)
        Order By: (location <-> '0101000020E6100000A7936C7539B35A40465D6BEF536518C0'::geography)
        Filter: (((ids1 @> '{100}'::bigint[]) OR (ids2 @> '{100}'::bigint[])) AND ((location <-> '0101000020E6100000A7936C7539B35A40465D6BEF536518C0'::geography) < '15000'::double precision))
        Rows Removed by Filter: 5840
        Buffers: shared hit=7408
Planning time: 0.398 ms
Execution time: 28.729 ms
(9 rows)
```
If I change (ids1 @> ARRAY[100]::bigint[] or ids2 @> ARRAY[100]::bigint[])
to (ids1 @> ARRAY[1]::bigint[] or ids2 @> ARRAY[1]::bigint[])
(100 to 1), the query plan changes to:
Limit (cost=8104.48..8106.48 rows=800 width=16) (actual time=10.106..10.147 rows=209 loops=1)
Buffers: shared hit=3201
-> Sort (cost=8104.48..8107.48 rows=1200 width=16) (actual time=10.105..10.123 rows=209 loops=1)
Sort Key: ((location <-> '0101000020E6100000A7936C7539B35A40465D6BEF536518C0'::geography))
Sort Method: quicksort Memory: 34kB
Buffers: shared hit=3201
-> Bitmap Heap Scan on test (cost=67.67..8043.10 rows=1200 width=16) (actual time=1.691..10.032 rows=209 loops=1)
Recheck Cond: ((ids1 @> '{1}'::bigint[]) OR (ids2 @> '{1}'::bigint[]))
Filter: ((location <-> '0101000020E6100000A7936C7539B35A40465D6BEF536518C0'::geography) < '15000'::double precision)
Rows Removed by Filter: 3376
Heap Blocks: exact=3185
Buffers: shared hit=3201
-> BitmapOr (cost=67.67..67.67 rows=3609 width=0) (actual time=0.982..0.982 rows=0 loops=1)
Buffers: shared hit=10
-> Bitmap Index Scan on gin_ids1 (cost=0.00..32.17 rows=1623 width=0) (actual time=0.622..0.622 rows=2030 loops=1)
Index Cond: (ids1 @> '{1}'::bigint[])
Buffers: shared hit=5
-> Bitmap Index Scan on gin_ids2 (cost=0.00..34.90 rows=1986 width=0) (actual time=0.359..0.359 rows=1960 loops=1)
Index Cond: (ids2 @> '{1}'::bigint[])
Buffers: shared hit=5
Planning time: 0.237 ms
Execution time: 10.215 ms
(22 rows)
With the radius filter applied, the condition (ids1 @> ARRAY[100]::bigint[] or ids2 @> ARRAY[100]::bigint[])
matches 1932 rows, more than the 209 rows matched by (ids1 @> ARRAY[1]::bigint[] or ids2 @> ARRAY[1]::bigint[]).
Without the radius filter, id 100 matches 36489 rows:
testgis=# select count(*)
testgis-# from test
testgis-# where
testgis-# location <-> ST_GeographyFromText('POINT(106.800382 -6.098953)') < 15000
testgis-# and
testgis-# (ids1 @> ARRAY[100]::bigint[] or ids2 @> ARRAY[100]::bigint[]);
count
-------
1932
(1 row)
testgis=# select count(*)
testgis-# from test
testgis-# where
testgis-# location <-> ST_GeographyFromText('POINT(106.800382 -6.098953)') < 15000
testgis-# and
testgis-# (ids1 @> ARRAY[1]::bigint[] or ids2 @> ARRAY[1]::bigint[]);
count
-------
209
(1 row)
testgis=# select count(*) from test where (ids1 @> ARRAY[100]::bigint[] or ids2 @> ARRAY[100]::bigint[]);
count
-------
36489
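
The counts above show that the two ids differ a lot in selectivity, which is presumably what drives the planner's choice between the two plans. As a sketch, the unfiltered totals for both ids can be collected side by side in one pass with the aggregate FILTER clause (available since PostgreSQL 9.4). The column aliases are made up for illustration.

```sql
-- Sketch: per-id match counts over the whole table, in a single scan.
-- id_100_total and id_1_total are illustrative aliases, not existing names.
SELECT
    count(*) FILTER (
        WHERE ids1 @> ARRAY[100]::bigint[] OR ids2 @> ARRAY[100]::bigint[]
    ) AS id_100_total,
    count(*) FILTER (
        WHERE ids1 @> ARRAY[1]::bigint[] OR ids2 @> ARRAY[1]::bigint[]
    ) AS id_1_total
FROM test;
```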
So how can I optimize this SQL (lng, lat, radius, and id are variables)?
select id
from test
where
location <-> ST_GeographyFromText('POINT(<lng> <lat>)') < <radius>
and
(ids1 @> ARRAY[<id>]::bigint[] or ids2 @> ARRAY[<id>]::bigint[])
order by location <-> ST_GeographyFromText('POINT(<lng> <lat>)') limit 800;
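
In both plans above, the radius condition location <-> ... < 15000 appears only under Filter, never as an index condition. One variant that may be worth comparing expresses the radius test with ST_DWithin, which PostGIS can evaluate against a GiST index on a geography column, and keeps <-> only for the ordering. The sketch below substitutes the example values (id 100, 15000 m) for the placeholders and has not been run against this table. Two differences from the original are worth keeping in mind: ST_DWithin is inclusive (<=), and by default it measures on the spheroid, while <-> on geography is, as far as I know, sphere-based, so rows very close to the radius boundary may differ.

```sql
-- Sketch: radius test via ST_DWithin (index-aware), ordering still via <->.
-- The literal point, radius and id are the example values from this post.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM test
WHERE ST_DWithin(
          location,
          ST_GeographyFromText('POINT(106.800382 -6.098953)'),
          15000  -- meters; inclusive (<=), unlike the original "< 15000"
      )
  AND (ids1 @> ARRAY[100]::bigint[] OR ids2 @> ARRAY[100]::bigint[])
ORDER BY location <-> ST_GeographyFromText('POINT(106.800382 -6.098953)')
LIMIT 800;
```

Whether this changes the chosen plan or the timings would have to be checked with EXPLAIN for both a frequent id (100) and a rare one (1).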
After setting enable_indexscan to 0, the query plan is:
testgis=# BEGIN; SET LOCAL enable_indexscan=0; explain (analyze, buffers) select id
BEGIN
SET
testgis-# from test
testgis-# where
testgis-# location <-> ST_GeographyFromText('POINT(106.800382 -6.098953)') < 15000
testgis-# and
testgis-# (ids1 @> ARRAY[100]::bigint[] or ids2 @> ARRAY[100]::bigint[])
testgis-# order by location <-> ST_GeographyFromText('POINT(106.800382 -6.098953)') limit 800; ROLLBACK;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=15627.24..15720.58 rows=800 width=16) (actual time=40.887..43.748 rows=800 loops=1)
Buffers: shared hit=6082
-> Gather Merge (cost=15627.24..16783.95 rows=9914 width=16) (actual time=40.886..43.648 rows=800 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=6082
-> Sort (cost=14627.21..14639.61 rows=4957 width=16) (actual time=24.839..24.882 rows=431 loops=3)
Sort Key: ((location <-> '0101000020E6100000A7936C7539B35A40465D6BEF536518C0'::geography))
Sort Method: quicksort Memory: 100kB
Buffers: shared hit=6082
-> Parallel Bitmap Heap Scan on test (cost=359.15..14322.97 rows=4957 width=16) (actual time=3.468..24.555 rows=644 loops=3)
Recheck Cond: ((ids1 @> '{100}'::bigint[]) OR (ids2 @> '{100}'::bigint[]))
Filter: ((location <-> '0101000020E6100000A7936C7539B35A40465D6BEF536518C0'::geography) < '15000'::double precision)
Rows Removed by Filter: 11519
Heap Blocks: exact=5140
Buffers: shared hit=6066
-> BitmapOr (cost=359.15..359.15 rows=35893 width=0) (actual time=4.661..4.661 rows=0 loops=1)
Buffers: shared hit=16
-> Bitmap Index Scan on gin_ids1 (cost=0.00..319.81 rows=34109 width=0) (actual time=4.163..4.163 rows=35123 loops=1)
Index Cond: (ids1 @> '{100}'::bigint[])
Buffers: shared hit=11
-> Bitmap Index Scan on gin_ids2 (cost=0.00..33.38 rows=1785 width=0) (actual time=0.496..0.496 rows=2003 loops=1)
Index Cond: (ids2 @> '{100}'::bigint[])
Buffers: shared hit=5
Planning time: 0.194 ms
Execution time: 43.847 ms
(26 rows)
ROLLBACK