Несколько полных соединений в Postgres работают медленно - PullRequest
0 голосов
/ 05 апреля 2010

У меня есть программа для использования базы данных IMDB, и у меня очень низкая производительность по моему запросу. Похоже, что он не использует мое условие «где», пока не материализует все. Я искал подсказки, чтобы использовать, но ничего, кажется, не работает. Вот мой запрос:

SELECT * 
 FROM name as n1
  FULL JOIN aka_name
ON n1.id = aka_name.person_id
 FULL JOIN cast_info as t2
ON n1.id = t2.person_id
 FULL JOIN person_info as t3
ON n1.id = t3.person_id
 FULL JOIN char_name as t4
ON t2.person_role_id = t4.id
 FULL JOIN role_type as t5
ON t2.role_id = t5.id
 FULL JOIN title as t6
ON t2.movie_id = t6.id
 FULL JOIN aka_title as t7 
ON t6.id = t7.movie_id
 FULL JOIN complete_cast as t8
ON t6.id = t8.movie_id
 FULL JOIN kind_type as t9
ON t6.kind_id = t9.id
 FULL JOIN movie_companies as t10
ON t6.id = t10.movie_id
 FULL JOIN movie_info as t11
ON t6.id = t11.movie_id
 FULL JOIN movie_info_idx as t19
ON t6.id = t19.movie_id
 FULL JOIN movie_keyword as t12
ON t6.id = t12.movie_id
 FULL JOIN movie_link as t13
ON t6.id = t13.linked_movie_id
 FULL JOIN link_type as t14
ON t13.link_type_id = t14.id
 FULL JOIN keyword as t15
ON t12.keyword_id = t15.id
 FULL JOIN company_name as t16
ON t10.company_id = t16.id
 FULL JOIN company_type as t17
ON t10.company_type_id = t17.id
 FULL JOIN comp_cast_type as t18
ON t8.status_id = t18.id
 WHERE n1.id = 2003

Очень таблицы связаны друг с другом в соединении через ограничения внешнего ключа и имеют индексы для всех упомянутых столбцов.

Подробности плана запроса:

"Hash Left Join  (cost=5838187.01..13756845.07 rows=15579622 width=835) (actual time=146879.213..146891.861 rows=20 loops=1)"
"  Hash Cond: (t8.status_id = t18.id)"
"  ->  Hash Left Join  (cost=5838185.92..13542624.18 rows=15579622 width=822) (actual time=146879.199..146891.833 rows=20 loops=1)"
"        Hash Cond: (t10.company_type_id = t17.id)"
"        ->  Hash Left Join  (cost=5838184.83..13328403.29 rows=15579622 width=797) (actual time=146879.165..146891.781 rows=20 loops=1)"
"              Hash Cond: (t10.company_id = t16.id)"
"              ->  Hash Left Join  (cost=5828372.95..10061752.03 rows=15579622 width=755) (actual time=146426.483..146429.756 rows=20 loops=1)"
"                    Hash Cond: (t12.keyword_id = t15.id)"
"                    ->  Hash Left Join  (cost=5825164.23..6914088.45 rows=15579622 width=731) (actual time=146372.411..146372.529 rows=20 loops=1)"
"                          Hash Cond: (t13.link_type_id = t14.id)"
"                          ->  Merge Left Join  (cost=5825162.82..6699867.24 rows=15579622 width=715) (actual time=146372.366..146372.472 rows=20 loops=1)"
"                                Merge Cond: (t6.id = t13.linked_movie_id)"
"                                ->  Merge Left Join  (cost=5684009.29..6378956.77 rows=15579622 width=699) (actual time=144019.620..144019.711 rows=20 loops=1)"
"                                      Merge Cond: (t6.id = t12.movie_id)"
"                                      ->  Merge Left Join  (cost=5182403.90..5622400.75 rows=8502523 width=687) (actual time=136849.731..136849.809 rows=20 loops=1)"
"                                            Merge Cond: (t6.id = t19.movie_id)"
"                                            ->  Merge Left Join  (cost=4974472.00..5315778.48 rows=8502523 width=637) (actual time=134972.032..134972.099 rows=20 loops=1)"
"                                                  Merge Cond: (t6.id = t11.movie_id)"
"                                                  ->  Merge Left Join  (cost=1830064.81..2033131.89 rows=1341632 width=561) (actual time=63784.035..63784.062 rows=2 loops=1)"
"                                                        Merge Cond: (t6.id = t10.movie_id)"
"                                                        ->  Nested Loop Left Join  (cost=1417360.29..1594294.02 rows=1044480 width=521) (actual time=59279.246..59279.264 rows=1 loops=1)"
"                                                              Join Filter: (t6.kind_id = t9.id)"
"                                                              ->  Merge Left Join  (cost=1417359.22..1429787.34 rows=1044480 width=507) (actual time=59279.222..59279.224 rows=1 loops=1)"
"                                                                    Merge Cond: (t6.id = t8.movie_id)"
"                                                                    ->  Merge Left Join  (cost=1405731.84..1414378.65 rows=1044480 width=491) (actual time=59121.773..59121.775 rows=1 loops=1)"
"                                                                          Merge Cond: (t6.id = t7.movie_id)"
"                                                                          ->  Sort  (cost=1346206.04..1348817.24 rows=1044480 width=416) (actual time=58095.230..58095.231 rows=1 loops=1)"
"                                                                                Sort Key: t6.id"
"                                                                                Sort Method:  quicksort  Memory: 17kB"
"                                                                                ->  Hash Left Join  (cost=172406.29..456387.53 rows=1044480 width=416) (actual time=57969.371..58095.208 rows=1 loops=1)"
"                                                                                      Hash Cond: (t2.movie_id = t6.id)"
"                                                                                      ->  Hash Left Join  (cost=104700.38..256885.82 rows=1044480 width=358) (actual time=49981.493..50006.303 rows=1 loops=1)"
"                                                                                            Hash Cond: (t2.role_id = t5.id)"
"                                                                                            ->  Hash Left Join  (cost=104699.11..242522.95 rows=1044480 width=343) (actual time=49981.441..50006.250 rows=1 loops=1)"
"                                                                                                  Hash Cond: (t2.person_role_id = t4.id)"
"                                                                                                  ->  Hash Left Join  (cost=464.96..12283.95 rows=1044480 width=269) (actual time=0.071..0.087 rows=1 loops=1)"
"                                                                                                        Hash Cond: (n1.id = t3.person_id)"
"                                                                                                        ->  Nested Loop Left Join  (cost=0.00..49.39 rows=7680 width=160) (actual time=0.051..0.066 rows=1 loops=1)"
"                                                                                                              ->  Nested Loop Left Join  (cost=0.00..17.04 rows=3 width=119) (actual time=0.038..0.041 rows=1 loops=1)"
"                                                                                                                    ->  Index Scan using name_pkey on name n1  (cost=0.00..8.68 rows=1 width=39) (actual time=0.022..0.024 rows=1 loops=1)"
"                                                                                                                          Index Cond: (id = 2003)"
"                                                                                                                    ->  Index Scan using aka_name_idx_person on aka_name  (cost=0.00..8.34 rows=1 width=80) (actual time=0.010..0.010 rows=0 loops=1)"
"                                                                                                                          Index Cond: ((aka_name.person_id = 2003) AND (n1.id = aka_name.person_id))"
"                                                                                                              ->  Index Scan using cast_info_idx_pid on cast_info t2  (cost=0.00..10.77 rows=1 width=41) (actual time=0.011..0.020 rows=1 loops=1)"
"                                                                                                                    Index Cond: ((t2.person_id = 2003) AND (n1.id = t2.person_id))"
"                                                                                                        ->  Hash  (cost=463.26..463.26 rows=136 width=109) (actual time=0.010..0.010 rows=0 loops=1)"
"                                                                                                              ->  Index Scan using person_info_idx_pid on person_info t3  (cost=0.00..463.26 rows=136 width=109) (actual time=0.009..0.009 rows=0 loops=1)"
"                                                                                                                    Index Cond: (person_id = 2003)"
"                                                                                                  ->  Hash  (cost=42697.62..42697.62 rows=2442362 width=74) (actual time=49305.872..49305.872 rows=2442362 loops=1)"
"                                                                                                        ->  Seq Scan on char_name t4  (cost=0.00..42697.62 rows=2442362 width=74) (actual time=14.066..22775.087 rows=2442362 loops=1)"
"                                                                                            ->  Hash  (cost=1.12..1.12 rows=12 width=15) (actual time=0.024..0.024 rows=12 loops=1)"
"                                                                                                  ->  Seq Scan on role_type t5  (cost=0.00..1.12 rows=12 width=15) (actual time=0.012..0.014 rows=12 loops=1)"
"                                                                                      ->  Hash  (cost=31134.07..31134.07 rows=1573507 width=58) (actual time=7841.225..7841.225 rows=1573507 loops=1)"
"                                                                                            ->  Seq Scan on title t6  (cost=0.00..31134.07 rows=1573507 width=58) (actual time=21.507..2799.443 rows=1573507 loops=1)"
"                                                                          ->  Materialize  (cost=59525.80..63203.88 rows=294246 width=75) (actual time=812.376..984.958 rows=192075 loops=1)"
"                                                                                ->  Sort  (cost=59525.80..60261.42 rows=294246 width=75) (actual time=812.363..922.452 rows=192075 loops=1)"
"                                                                                      Sort Key: t7.movie_id"
"                                                                                      Sort Method:  external merge  Disk: 24880kB"
"                                                                                      ->  Seq Scan on aka_title t7  (cost=0.00..6646.46 rows=294246 width=75) (actual time=24.652..164.822 rows=294246 loops=1)"
"                                                                    ->  Materialize  (cost=11627.38..12884.43 rows=100564 width=16) (actual time=123.819..149.086 rows=41907 loops=1)"
"                                                                          ->  Sort  (cost=11627.38..11878.79 rows=100564 width=16) (actual time=123.807..138.530 rows=41907 loops=1)"
"                                                                                Sort Key: t8.movie_id"
"                                                                                Sort Method:  external merge  Disk: 3136kB"
"                                                                                ->  Seq Scan on complete_cast t8  (cost=0.00..1549.64 rows=100564 width=16) (actual time=0.013..10.744 rows=100564 loops=1)"
"                                                              ->  Materialize  (cost=1.08..1.15 rows=7 width=14) (actual time=0.016..0.029 rows=7 loops=1)"
"                                                                    ->  Seq Scan on kind_type t9  (cost=0.00..1.07 rows=7 width=14) (actual time=0.011..0.013 rows=7 loops=1)"
"                                                        ->  Materialize  (cost=412704.52..437969.09 rows=2021166 width=40) (actual time=3420.356..4278.545 rows=1028995 loops=1)"
"                                                              ->  Sort  (cost=412704.52..417757.43 rows=2021166 width=40) (actual time=3420.349..3953.483 rows=1028995 loops=1)"
"                                                                    Sort Key: t10.movie_id"
"                                                                    Sort Method:  external merge  Disk: 90960kB"
"                                                                    ->  Seq Scan on movie_companies t10  (cost=0.00..35214.66 rows=2021166 width=40) (actual time=13.271..566.893 rows=2021166 loops=1)"
"                                                  ->  Materialize  (cost=3144407.19..3269057.42 rows=9972019 width=76) (actual time=65485.672..70083.219 rows=5039009 loops=1)"
"                                                        ->  Sort  (cost=3144407.19..3169337.23 rows=9972019 width=76) (actual time=65485.667..68385.550 rows=5038999 loops=1)"
"                                                              Sort Key: t11.movie_id"
"                                                              Sort Method:  external merge  Disk: 735512kB"
"                                                              ->  Seq Scan on movie_info t11  (cost=0.00..212815.19 rows=9972019 width=76) (actual time=15.750..15715.608 rows=9972019 loops=1)"
"                                            ->  Materialize  (cost=207925.01..219867.92 rows=955433 width=50) (actual time=1483.989..1785.636 rows=429401 loops=1)"
"                                                  ->  Sort  (cost=207925.01..210313.59 rows=955433 width=50) (actual time=1483.983..1654.165 rows=429401 loops=1)"
"                                                        Sort Key: t19.movie_id"
"                                                        Sort Method:  external merge  Disk: 31720kB"
"                                                        ->  Seq Scan on movie_info_idx t19  (cost=0.00..15047.33 rows=955433 width=50) (actual time=7.284..221.597 rows=955433 loops=1)"
"                                      ->  Materialize  (cost=501605.39..537645.64 rows=2883220 width=12) (actual time=5823.040..6868.242 rows=1597396 loops=1)"
"                                            ->  Sort  (cost=501605.39..508813.44 rows=2883220 width=12) (actual time=5823.026..6477.517 rows=1597396 loops=1)"
"                                                  Sort Key: t12.movie_id"
"                                                  Sort Method:  external merge  Disk: 78888kB"
"                                                  ->  Seq Scan on movie_keyword t12  (cost=0.00..44417.20 rows=2883220 width=12) (actual time=11.672..839.498 rows=2883220 loops=1)"
"                                ->  Materialize  (cost=141143.93..152995.81 rows=948150 width=16) (actual time=1916.356..2253.004 rows=478358 loops=1)"
"                                      ->  Sort  (cost=141143.93..143514.31 rows=948150 width=16) (actual time=1916.344..2125.698 rows=478358 loops=1)"
"                                            Sort Key: t13.linked_movie_id"
"                                            Sort Method:  external merge  Disk: 29632kB"
"                                            ->  Seq Scan on movie_link t13  (cost=0.00..14607.50 rows=948150 width=16) (actual time=27.610..297.962 rows=948150 loops=1)"
"                          ->  Hash  (cost=1.18..1.18 rows=18 width=16) (actual time=0.020..0.020 rows=18 loops=1)"
"                                ->  Seq Scan on link_type t14  (cost=0.00..1.18 rows=18 width=16) (actual time=0.010..0.012 rows=18 loops=1)"
"                    ->  Hash  (cost=1537.10..1537.10 rows=91010 width=24) (actual time=54.055..54.055 rows=91010 loops=1)"
"                          ->  Seq Scan on keyword t15  (cost=0.00..1537.10 rows=91010 width=24) (actual time=0.006..14.703 rows=91010 loops=1)"
"              ->  Hash  (cost=4585.61..4585.61 rows=245461 width=42) (actual time=445.269..445.269 rows=245461 loops=1)"
"                    ->  Seq Scan on company_name t16  (cost=0.00..4585.61 rows=245461 width=42) (actual time=12.037..309.961 rows=245461 loops=1)"
"        ->  Hash  (cost=1.04..1.04 rows=4 width=25) (actual time=0.013..0.013 rows=4 loops=1)"
"              ->  Seq Scan on company_type t17  (cost=0.00..1.04 rows=4 width=25) (actual time=0.009..0.010 rows=4 loops=1)"
"  ->  Hash  (cost=1.04..1.04 rows=4 width=13) (actual time=0.006..0.006 rows=4 loops=1)"
"        ->  Seq Scan on comp_cast_type t18  (cost=0.00..1.04 rows=4 width=13) (actual time=0.002..0.003 rows=4 loops=1)"
"Total runtime: 147055.016 ms"

Есть ли способ принудительно вызвать name.id = 2003, прежде чем он попытается объединить все таблицы вместе? Как видите, конечный результат - 4 кортежа, но, похоже, это должно быть быстрое соединение с использованием доступного индекса после того, как оно ограничено предложением name, хотя и очень сложным.

1 Ответ

1 голос
/ 05 апреля 2010

Будет сложно оптимизировать запрос с таким количеством FULL JOIN.Вы уверены, что это действительно то, что вы имеете в виду - а не ВНУТРЕННЕЕ СОЕДИНЕНИЕ, или, по крайней мере, ЛЕВОЕ СОЕДИНЕНИЕ?FULL JOIN не может быть переупорядочен, и есть много других оптимизаций, которые также невозможны, поскольку они предназначены для возврата вещей независимо от того, в какой таблице они находятся. Не рассматривая схему конкретно, но из запроса это выглядитэто может быть как минимум левое соединение в большинстве из этих мест.Вы действительно хотите строки из char_name, даже если в имени нет соответствующей строки и т. Д.?

Добро пожаловать на сайт PullRequest, где вы можете задавать вопросы и получать ответы от других членов сообщества.
...