PostgreSQL очень медленно при фильтрации по индексу - PullRequest
1 голос
/ 27 апреля 2020

У меня есть Log таблица с 60M записями в ней, и я пытаюсь выполнить тестовые запросы к полю UserId с индексом. К сожалению, это очень медленно, сканирование занимает около 13 секунд . Я размещаю базу данных на DO с 8 ГБ ОЗУ, 4 ЦП и SSD.

Структура таблицы

create table "Log"
(
    "Id"                    integer generated by default as identity
        constraint "PK_Log"
            primary key,
    "Result"                boolean   not null,
    "DateStamp"             timestamp not null,
    "ConversionTime"        integer   not null,
    "ConversionCost"        integer   not null,
    "Converter"             text,
    "SourceFileFormat"      text,
    "DestinationFileFormat" text,
    "Agent"                 text,
    "AgentVersion"          text,
    "Os"                    text,
    "OsVersion"             text,
    "Ip"                    text,
    "Source"                text,
    "Error"                 text,
    "UserId"                integer   not null
        constraint "FK_Log_User_UserId"
            references "User"
            on delete cascade,
    "TokenId"               text
);

alter table "Log"
    owner to doadmin;

create index "IX_Log_ConversionCost"
    on "Log" ("ConversionCost");

create index "IX_Log_DateStamp"
    on "Log" ("DateStamp");

create index "IX_Log_Result"
    on "Log" ("Result");

create index "IX_Log_UserId"
    on "Log" ("UserId");

create index "IX_Log_UserId-all"
    on "Log" ("UserId", "DateStamp", "Result", "ConversionCost");

Фактический тестовый SQL запрос, который занимает больше времени, чем должно быть

set random_page_cost=1;
set seq_page_cost=4;

EXPLAIN (buffers, analyse )
SELECT
    "UserId"
FROM "Log"
WHERE "UserId" = 7841

GROUP BY 1;

План выполнения

Group  (cost=1000.57..1917660.13 rows=626 width=4) (actual time=2.404..13845.150 rows=1 loops=1)
  Group Key: "UserId"
  Buffers: shared hit=6038 read=857126
  ->  Gather  (cost=1000.57..1917657.00 rows=1252 width=4) (actual time=2.389..13895.478 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=6038 read=857126
        ->  Group  (cost=0.56..1916531.80 rows=626 width=4) (actual time=1.134..13841.871 rows=1 loops=3)
              Group Key: "UserId"
              Buffers: shared hit=6038 read=857126
              ->  Parallel Index Only Scan using "IX_Log_UserId" on "Log"  (cost=0.56..1913347.27 rows=1273813 width=4) (actual time=1.132..13721.591 rows=1064916 loops=3)
                    Index Cond: ("UserId" = 7841)
                    Heap Fetches: 3194749
                    Buffers: shared hit=6038 read=857126
Planning Time: 0.097 ms
Execution Time: 13895.539 ms

https://explain.depesz.com/s/t8OTa

Это с пояснением времени

set random_page_cost=1;
set seq_page_cost=1;

explain (analyze, buffers, timing)
SELECT
    "UserId"
FROM "Log"
WHERE "UserId" = 7841

GROUP BY 1;

Group  (cost=33076.49..1327372.02 rows=626 width=4) (actual time=769.610..14062.469 rows=1 loops=1)
  Group Key: "UserId"
  Buffers: shared hit=3541 read=853584
  ->  Gather  (cost=33076.49..1327368.89 rows=1252 width=4) (actual time=769.167..14139.210 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=3541 read=853584
        ->  Group  (cost=32076.49..1326243.69 rows=626 width=4) (actual time=759.835..14043.414 rows=1 loops=3)
              Group Key: "UserId"
              Buffers: shared hit=3541 read=853584
              ->  Parallel Bitmap Heap Scan on "Log"  (cost=32076.49..1323059.16 rows=1273813 width=4) (actual time=759.611..13614.695 rows=1064916 loops=3)
                    Recheck Cond: ("UserId" = 7841)
                    Heap Blocks: exact=284018
                    Buffers: shared hit=3541 read=853584
                    ->  Bitmap Index Scan on "IX_Log_UserId"  (cost=0.00..31312.20 rows=3057152 width=0) (actual time=454.977..454.977 rows=3194749 loops=1)
                          Index Cond: ("UserId" = 7841)
                          Buffers: shared hit=3541 read=5192
Planning Time: 0.140 ms
Execution Time: 14141.648 ms

https://explain.depesz.com/s/xEtn

Еще один «реальный» запрос с таким же медленным результатом

set random_page_cost=1;
set seq_page_cost=1;

explain (analyze, buffers, timing)
SELECT
    "Result"
FROM "Log"
WHERE "UserId" = 7841

GROUP BY 1;

План выполнения

Group  (cost=1595170.54..1595171.02 rows=2 width=1) (actual time=29233.895..29233.899 rows=2 loops=1)
  Group Key: "Result"
  Buffers: shared hit=72330 read=362500
  ->  Gather Merge  (cost=1595170.54..1595171.01 rows=4 width=1) (actual time=29233.894..29286.971 rows=6 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=199587 read=1075489
        ->  Sort  (cost=1594170.52..1594170.53 rows=2 width=1) (actual time=29227.651..29227.651 rows=2 loops=3)
              Sort Key: "Result"
              Sort Method: quicksort  Memory: 25kB
              Worker 0:  Sort Method: quicksort  Memory: 25kB
              Worker 1:  Sort Method: quicksort  Memory: 25kB
              Buffers: shared hit=199587 read=1075489
              ->  Partial HashAggregate  (cost=1594170.49..1594170.51 rows=2 width=1) (actual time=29227.616..29227.617 rows=2 loops=3)
                    Group Key: "Result"
                    Buffers: shared hit=199571 read=1075489
                    ->  Parallel Seq Scan on "Log"  (cost=0.00..1590985.96 rows=1273813 width=1) (actual time=0.220..28997.691 rows=1064916 loops=3)
                          Filter: ("UserId" = 7841)
                          Rows Removed by Filter: 19154346
                          Buffers: shared hit=199571 read=1075489
Planning Time: 3.910 ms
Execution Time: 29287.100 ms

https://explain.depesz.com/s/tjoK

Добро пожаловать на сайт PullRequest, где вы можете задавать вопросы и получать ответы от других членов сообщества.
...