SQL funnel analysis - PullRequest

April 2, 2020

I have a table of page-view events, and I am currently extracting funnel data from it. My current query is not very efficient once the funnel has more than 4 steps.

For example: I want a funnel with the following history events, in this order.

/ -> /pricing -> /docs -> /contact

The output should be the number of unique users at each step, for example: 500, 300, 200, 10.

My table looks like this:

CREATE TABLE user_history (
    url character varying NOT NULL, -- paths like / /pricing ...
    time timestamp without time zone DEFAULT now(),
    "trackedUserId" character varying,
    "projectId" character varying
);
+----+-----------+---------------+------------------------+
| ID |   url     | trackedUserId |          time          |
+----+-----------+---------------+------------------------+
|  1 | /         |      1        | 2017-09-22 13:47:00+00 |
|  2 | /         |      2        | 2017-09-22 13:48:00+00 |
|  3 | /pricing  |      2        | 2017-09-22 13:49:00+00 |
|  4 | /pricing  |      2        | 2017-09-22 13:49:00+00 |
+----+-----------+---------------+------------------------+
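
To make the expected output concrete, the sample rows above could be loaded as in the sketch below. Two assumptions: the ID column shown in the sample is not part of the CREATE TABLE statement, so it is omitted here, and the "projectId" value 'splitbee' is borrowed from the query plan further down. For the funnel / -> /pricing -> /docs -> /contact these rows should give 2, 1, 0, 0: two users viewed /, only user 2 viewed /pricing afterwards, and nobody reached the later steps.

```sql
-- Sample rows from the table above (ID column omitted; it is not in the DDL).
-- 'splitbee' as "projectId" is an assumption taken from the query plan.
INSERT INTO user_history (url, time, "trackedUserId", "projectId") VALUES
    ('/',        '2017-09-22 13:47:00', '1', 'splitbee'),
    ('/',        '2017-09-22 13:48:00', '2', 'splitbee'),
    ('/pricing', '2017-09-22 13:49:00', '2', 'splitbee'),
    ('/pricing', '2017-09-22 13:49:00', '2', 'splitbee');  -- duplicate view; must not be counted twice
```

The duplicate /pricing row is why each step counts distinct users rather than rows.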

My current query already returns the correct results:

select count(distinct d1."trackedUserId") as e1,
       count(distinct d2."trackedUserId") as e2,
       count(distinct d3."trackedUserId") as e3,
       count(distinct d4."trackedUserId") as e4
from user_history d1
left join user_history d2 on d2."trackedUserId" = d1."trackedUserId" and d2.time > d1.time and d2.url = '/pricing' -- second step
left join user_history d3 on d3."trackedUserId" = d2."trackedUserId" and d3.time > d2.time and d3.url = '/' -- third step
left join user_history d4 on d4."trackedUserId" = d3."trackedUserId" and d4.time > d3.time and d4.url = '/contact' -- fourth step
where d1.url = '/'; -- start url

The result looks like this:

[image: Result]

EXPLAIN ANALYZE output:

QUERY PLAN
Aggregate  (cost=730.08..730.09 rows=1 width=32) (actual time=13519.105..13519.105 rows=1 loops=1)
  ->  Hash Left Join  (cost=405.21..725.66 rows=442 width=124) (actual time=28.963..1362.802 rows=4015013 loops=1)
"        Hash Cond: ((d1.""trackedUserId"")::text = (d2.""trackedUserId"")::text)"
"        Join Filter: (d2.""time"" > d1.""time"")"
        Rows Removed by Join Filter: 4335538
        ->  Append  (cost=0.14..312.85 rows=442 width=39) (actual time=0.065..1.776 rows=2087 loops=1)
"              ->  Index Scan using ""_hyper_15_6_chunk_IDX_1607e47604c0536bca9ce4c16f"" on _hyper_15_6_chunk d1  (cost=0.14..1.60 rows=1 width=39) (actual time=0.018..0.018 rows=0 loops=1)"
"                    Index Cond: ((""projectId"")::text = 'splitbee'::text)"
"                    Filter: ((url)::text = '/'::text)"
"              ->  Index Scan using ""_hyper_15_7_chunk_IDX_1607e47604c0536bca9ce4c16f"" on _hyper_15_7_chunk d1_1  (cost=0.14..1.60 rows=1 width=39) (actual time=0.017..0.017 rows=0 loops=1)"
"                    Index Cond: ((""projectId"")::text = 'splitbee'::text)"
"                    Filter: ((url)::text = '/'::text)"
"              ->  Index Scan using ""_hyper_15_8_chunk_IDX_1607e47604c0536bca9ce4c16f"" on _hyper_15_8_chunk d1_2  (cost=0.27..4.76 rows=20 width=39) (actual time=0.028..0.041 rows=29 loops=1)"
"                    Index Cond: ((""projectId"")::text = 'splitbee'::text)"
"                    Filter: ((url)::text = '/'::text)"
"              ->  Index Scan using ""_hyper_15_9_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_9_chunk d1_3  (cost=0.27..5.50 rows=12 width=39) (actual time=0.015..0.022 rows=26 loops=1)"
"                    Index Cond: (((url)::text = '/'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"              ->  Index Scan using ""_hyper_15_10_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_10_chunk d1_4  (cost=0.28..15.79 rows=46 width=39) (actual time=0.011..0.040 rows=73 loops=1)"
"                    Index Cond: (((url)::text = '/'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"              ->  Index Scan using ""_hyper_15_12_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_12_chunk d1_5  (cost=0.28..24.94 rows=39 width=39) (actual time=0.012..0.046 rows=77 loops=1)"
"                    Index Cond: (((url)::text = '/'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"              ->  Index Scan using ""_hyper_15_14_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_14_chunk d1_6  (cost=0.42..59.38 rows=58 width=39) (actual time=0.017..0.347 rows=600 loops=1)"
"                    Index Cond: (((url)::text = '/'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"              ->  Index Scan using ""_hyper_15_16_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_16_chunk d1_7  (cost=0.42..32.87 rows=31 width=39) (actual time=0.020..0.281 rows=305 loops=1)"
"                    Index Cond: (((url)::text = '/'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"              ->  Index Scan using ""_hyper_15_18_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_18_chunk d1_8  (cost=0.42..63.09 rows=61 width=39) (actual time=0.237..0.525 rows=466 loops=1)"
"                    Index Cond: (((url)::text = '/'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"              ->  Index Scan using ""_hyper_15_20_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_20_chunk d1_9  (cost=0.28..91.82 rows=152 width=39) (actual time=0.011..0.235 rows=462 loops=1)"
"                    Index Cond: (((url)::text = '/'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"              ->  Index Scan using ""_hyper_15_22_chunk_IDX_1607e47604c0536bca9ce4c16f"" on _hyper_15_22_chunk d1_10  (cost=0.28..9.29 rows=21 width=39) (actual time=0.011..0.034 rows=49 loops=1)"
"                    Index Cond: ((""projectId"")::text = 'splitbee'::text)"
"                    Filter: ((url)::text = '/'::text)"
                    Rows Removed by Filter: 23
        ->  Hash  (cost=404.47..404.47 rows=48 width=101) (actual time=28.764..28.764 rows=27911 loops=1)
              Buckets: 16384 (originally 1024)  Batches: 256 (originally 1)  Memory Usage: 2206kB
              ->  Hash Left Join  (cost=83.71..404.47 rows=48 width=101) (actual time=3.161..17.770 rows=27911 loops=1)
"                    Hash Cond: ((d3.""trackedUserId"")::text = (d4.""trackedUserId"")::text)"
"                    Join Filter: (d4.""time"" > d3.""time"")"
                    ->  Hash Right Join  (cost=59.46..379.91 rows=48 width=78) (actual time=3.015..12.950 rows=27911 loops=1)
"                          Hash Cond: ((d3.""trackedUserId"")::text = (d2.""trackedUserId"")::text)"
"                          Join Filter: (d3.""time"" > d2.""time"")"
                          Rows Removed by Join Filter: 48527
                          ->  Append  (cost=0.14..312.85 rows=442 width=39) (actual time=0.030..2.306 rows=2087 loops=1)
"                                ->  Index Scan using ""_hyper_15_6_chunk_IDX_1607e47604c0536bca9ce4c16f"" on _hyper_15_6_chunk d3  (cost=0.14..1.60 rows=1 width=39) (actual time=0.007..0.007 rows=0 loops=1)"
"                                      Index Cond: ((""projectId"")::text = 'splitbee'::text)"
"                                      Filter: ((url)::text = '/'::text)"
"                                ->  Index Scan using ""_hyper_15_7_chunk_IDX_1607e47604c0536bca9ce4c16f"" on _hyper_15_7_chunk d3_1  (cost=0.14..1.60 rows=1 width=39) (actual time=0.009..0.009 rows=0 loops=1)"
"                                      Index Cond: ((""projectId"")::text = 'splitbee'::text)"
"                                      Filter: ((url)::text = '/'::text)"
"                                ->  Index Scan using ""_hyper_15_8_chunk_IDX_1607e47604c0536bca9ce4c16f"" on _hyper_15_8_chunk d3_2  (cost=0.27..4.76 rows=20 width=39) (actual time=0.012..0.021 rows=29 loops=1)"
"                                      Index Cond: ((""projectId"")::text = 'splitbee'::text)"
"                                      Filter: ((url)::text = '/'::text)"
"                                ->  Index Scan using ""_hyper_15_9_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_9_chunk d3_3  (cost=0.27..5.50 rows=12 width=39) (actual time=0.012..0.018 rows=26 loops=1)"
"                                      Index Cond: (((url)::text = '/'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                ->  Index Scan using ""_hyper_15_10_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_10_chunk d3_4  (cost=0.28..15.79 rows=46 width=39) (actual time=0.021..0.057 rows=73 loops=1)"
"                                      Index Cond: (((url)::text = '/'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                ->  Index Scan using ""_hyper_15_12_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_12_chunk d3_5  (cost=0.28..24.94 rows=39 width=39) (actual time=0.018..0.066 rows=77 loops=1)"
"                                      Index Cond: (((url)::text = '/'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                ->  Index Scan using ""_hyper_15_14_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_14_chunk d3_6  (cost=0.42..59.38 rows=58 width=39) (actual time=0.020..0.491 rows=600 loops=1)"
"                                      Index Cond: (((url)::text = '/'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                ->  Index Scan using ""_hyper_15_16_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_16_chunk d3_7  (cost=0.42..32.87 rows=31 width=39) (actual time=0.032..0.667 rows=305 loops=1)"
"                                      Index Cond: (((url)::text = '/'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                ->  Index Scan using ""_hyper_15_18_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_18_chunk d3_8  (cost=0.42..63.09 rows=61 width=39) (actual time=0.026..0.435 rows=466 loops=1)"
"                                      Index Cond: (((url)::text = '/'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                ->  Index Scan using ""_hyper_15_20_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_20_chunk d3_9  (cost=0.28..91.82 rows=152 width=39) (actual time=0.016..0.328 rows=462 loops=1)"
"                                      Index Cond: (((url)::text = '/'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                ->  Index Scan using ""_hyper_15_22_chunk_IDX_1607e47604c0536bca9ce4c16f"" on _hyper_15_22_chunk d3_10  (cost=0.28..9.29 rows=21 width=39) (actual time=0.020..0.048 rows=49 loops=1)"
"                                      Index Cond: ((""projectId"")::text = 'splitbee'::text)"
"                                      Filter: ((url)::text = '/'::text)"
                                      Rows Removed by Filter: 23
                          ->  Hash  (cost=58.71..58.71 rows=48 width=39) (actual time=1.978..1.979 rows=967 loops=1)
                                Buckets: 1024  Batches: 1  Memory Usage: 76kB
                                ->  Append  (cost=0.14..58.71 rows=48 width=39) (actual time=0.094..1.724 rows=967 loops=1)
"                                      ->  Index Scan using ""_hyper_15_6_chunk_IDX_1607e47604c0536bca9ce4c16f"" on _hyper_15_6_chunk d2  (cost=0.14..1.60 rows=1 width=39) (actual time=0.006..0.006 rows=0 loops=1)"
"                                            Index Cond: ((""projectId"")::text = 'splitbee'::text)"
"                                            Filter: ((url)::text = '/pricing'::text)"
"                                      ->  Index Scan using ""_hyper_15_7_chunk_IDX_1607e47604c0536bca9ce4c16f"" on _hyper_15_7_chunk d2_1  (cost=0.14..1.60 rows=1 width=39) (actual time=0.007..0.007 rows=0 loops=1)"
"                                            Index Cond: ((""projectId"")::text = 'splitbee'::text)"
"                                            Filter: ((url)::text = '/pricing'::text)"
"                                      ->  Index Scan using ""_hyper_15_8_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_8_chunk d2_2  (cost=0.27..2.29 rows=1 width=39) (actual time=0.019..0.019 rows=0 loops=1)"
"                                            Index Cond: (((url)::text = '/pricing'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                      ->  Index Scan using ""_hyper_15_9_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_9_chunk d2_3  (cost=0.27..2.29 rows=1 width=39) (actual time=0.018..0.018 rows=0 loops=1)"
"                                            Index Cond: (((url)::text = '/pricing'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                      ->  Index Scan using ""_hyper_15_10_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_10_chunk d2_4  (cost=0.28..3.29 rows=2 width=39) (actual time=0.021..0.021 rows=0 loops=1)"
"                                            Index Cond: (((url)::text = '/pricing'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                      ->  Index Scan using ""_hyper_15_12_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_12_chunk d2_5  (cost=0.28..9.10 rows=8 width=39) (actual time=0.022..0.063 rows=28 loops=1)"
"                                            Index Cond: (((url)::text = '/pricing'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                      ->  Index Scan using ""_hyper_15_14_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_14_chunk d2_6  (cost=0.42..3.45 rows=2 width=39) (actual time=0.028..0.311 rows=180 loops=1)"
"                                            Index Cond: (((url)::text = '/pricing'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                      ->  Index Scan using ""_hyper_15_16_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_16_chunk d2_7  (cost=0.42..2.44 rows=1 width=39) (actual time=0.021..0.454 rows=160 loops=1)"
"                                            Index Cond: (((url)::text = '/pricing'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                      ->  Index Scan using ""_hyper_15_18_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_18_chunk d2_8  (cost=0.42..2.44 rows=1 width=39) (actual time=0.030..0.406 rows=310 loops=1)"
"                                            Index Cond: (((url)::text = '/pricing'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                      ->  Index Scan using ""_hyper_15_20_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_20_chunk d2_9  (cost=0.28..25.68 rows=27 width=39) (actual time=0.035..0.291 rows=266 loops=1)"
"                                            Index Cond: (((url)::text = '/pricing'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                      ->  Index Scan using ""_hyper_15_22_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_22_chunk d2_10  (cost=0.28..4.27 rows=3 width=39) (actual time=0.023..0.057 rows=23 loops=1)"
"                                            Index Cond: (((url)::text = '/pricing'::text) AND ((""projectId"")::text = 'splitbee'::text))"
                    ->  Hash  (cost=24.11..24.11 rows=11 width=39) (actual time=0.128..0.129 rows=0 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 8kB
                          ->  Append  (cost=0.14..24.11 rows=11 width=39) (actual time=0.128..0.128 rows=0 loops=1)
"                                ->  Index Scan using ""_hyper_15_6_chunk_IDX_1607e47604c0536bca9ce4c16f"" on _hyper_15_6_chunk d4  (cost=0.14..1.60 rows=1 width=39) (actual time=0.009..0.009 rows=0 loops=1)"
"                                      Index Cond: ((""projectId"")::text = 'splitbee'::text)"
"                                      Filter: ((url)::text = '/contact'::text)"
"                                ->  Index Scan using ""_hyper_15_7_chunk_IDX_1607e47604c0536bca9ce4c16f"" on _hyper_15_7_chunk d4_1  (cost=0.14..1.60 rows=1 width=39) (actual time=0.005..0.005 rows=0 loops=1)"
"                                      Index Cond: ((""projectId"")::text = 'splitbee'::text)"
"                                      Filter: ((url)::text = '/contact'::text)"
"                                ->  Index Scan using ""_hyper_15_8_chunk_IDX_d44f2267460950e92bc03af849"" on _hyper_15_8_chunk d4_2  (cost=0.15..2.17 rows=1 width=39) (actual time=0.013..0.013 rows=0 loops=1)"
"                                      Index Cond: ((url)::text = '/contact'::text)"
"                                      Filter: ((""projectId"")::text = 'splitbee'::text)"
"                                ->  Index Scan using ""_hyper_15_9_chunk_IDX_d44f2267460950e92bc03af849"" on _hyper_15_9_chunk d4_3  (cost=0.15..2.17 rows=1 width=39) (actual time=0.008..0.008 rows=0 loops=1)"
"                                      Index Cond: ((url)::text = '/contact'::text)"
"                                      Filter: ((""projectId"")::text = 'splitbee'::text)"
"                                ->  Index Scan using ""_hyper_15_10_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_10_chunk d4_4  (cost=0.28..2.30 rows=1 width=39) (actual time=0.019..0.019 rows=0 loops=1)"
"                                      Index Cond: (((url)::text = '/contact'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                ->  Index Scan using ""_hyper_15_12_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_12_chunk d4_5  (cost=0.28..2.30 rows=1 width=39) (actual time=0.008..0.008 rows=0 loops=1)"
"                                      Index Cond: (((url)::text = '/contact'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                ->  Index Scan using ""_hyper_15_14_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_14_chunk d4_6  (cost=0.42..2.44 rows=1 width=39) (actual time=0.015..0.015 rows=0 loops=1)"
"                                      Index Cond: (((url)::text = '/contact'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                ->  Index Scan using ""_hyper_15_16_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_16_chunk d4_7  (cost=0.42..2.44 rows=1 width=39) (actual time=0.019..0.019 rows=0 loops=1)"
"                                      Index Cond: (((url)::text = '/contact'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                ->  Index Scan using ""_hyper_15_18_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_18_chunk d4_8  (cost=0.42..2.44 rows=1 width=39) (actual time=0.011..0.011 rows=0 loops=1)"
"                                      Index Cond: (((url)::text = '/contact'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                ->  Index Scan using ""_hyper_15_20_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_20_chunk d4_9  (cost=0.28..2.30 rows=1 width=39) (actual time=0.011..0.011 rows=0 loops=1)"
"                                      Index Cond: (((url)::text = '/contact'::text) AND ((""projectId"")::text = 'splitbee'::text))"
"                                ->  Index Scan using ""_hyper_15_22_chunk_IDX_3bdd006d196d60162331322b7a"" on _hyper_15_22_chunk d4_10  (cost=0.28..2.29 rows=1 width=39) (actual time=0.007..0.007 rows=0 loops=1)"
"                                      Index Cond: (((url)::text = '/contact'::text) AND ((""projectId"")::text = 'splitbee'::text))"
Planning Time: 7.771 ms
Execution Time: 13519.888 ms
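
Reading the plan above: the outermost Hash Left Join emits 4,015,013 rows from only 2,087 step-one rows, and the Aggregate node accounts for about 13.5 s against about 1.4 s for the join beneath it. Most of the time therefore appears to go into count(distinct ...) over the fanned-out rows. Below is a minimal sketch of one way to avoid that fan-out, assuming each step only needs "some later event" for the same user. It keeps the earliest qualifying timestamp per user at every step, so no intermediate result holds more than one row per user. The CTE names and the reached_at column are made up for illustration, and the step URLs are the ones used in the query above.

```sql
-- Sketch only: one row per user per step (earliest qualifying timestamp).
-- Step URLs mirror the posted query; use '/docs' for the third step to
-- match the funnel described in the text.
WITH step1 AS (
    SELECT h."trackedUserId", min(h.time) AS reached_at
    FROM user_history h
    WHERE h.url = '/'
      AND h."trackedUserId" IS NOT NULL  -- count(distinct ...) ignores NULLs too
    GROUP BY h."trackedUserId"
),
step2 AS (
    SELECT h."trackedUserId", min(h.time) AS reached_at
    FROM user_history h
    JOIN step1 p ON p."trackedUserId" = h."trackedUserId"
                AND h.time > p.reached_at
    WHERE h.url = '/pricing'
    GROUP BY h."trackedUserId"
),
step3 AS (
    SELECT h."trackedUserId", min(h.time) AS reached_at
    FROM user_history h
    JOIN step2 p ON p."trackedUserId" = h."trackedUserId"
                AND h.time > p.reached_at
    WHERE h.url = '/'
    GROUP BY h."trackedUserId"
),
step4 AS (
    SELECT h."trackedUserId", min(h.time) AS reached_at
    FROM user_history h
    JOIN step3 p ON p."trackedUserId" = h."trackedUserId"
                AND h.time > p.reached_at
    WHERE h.url = '/contact'
    GROUP BY h."trackedUserId"
)
SELECT (SELECT count(*) FROM step1) AS e1,
       (SELECT count(*) FROM step2) AS e2,
       (SELECT count(*) FROM step3) AS e3,
       (SELECT count(*) FROM step4) AS e4;
```

Taking the earliest timestamp at each step is enough because a user who can complete the chain at all can also complete it starting from their earliest qualifying events. Whether this is actually faster on the real data has not been measured here. Adding another step means adding one more CTE of the same shape.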

I have the following indexes: [image: indexes]
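
The index definitions were posted as an image and are not reproduced here. Judging only from the index conditions in the plan above, some chunks have an index that covers url and "projectId". As a purely hypothetical sketch, a composite index that also includes "trackedUserId" and time might help per-user, per-step lookups. The index name and column order are assumptions and would need to be checked with EXPLAIN ANALYZE on the real data.

```sql
-- Hypothetical index; name and column order are assumptions, not the asker's setup.
CREATE INDEX IF NOT EXISTS user_history_funnel_idx
    ON user_history ("projectId", url, "trackedUserId", time);
```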
