Determining the correlation between aggregated and non-aggregated data
08 December 2018

I have a dataset that is essentially a list of lists built from the output of a SQL query. Here is what it looks like:

[[(datetime.datetime(2017, 12, 1, 0, 0), Decimal('7.9618320610687023')), (datetime.datetime(2018, 1, 1, 0, 0), Decimal('3.8426966292134831')), (datetime.datetime(2018, 2, 1, 0, 0), Decimal('4.4876543209876543')), (datetime.datetime(2018, 3, 1, 0, 0), Decimal('4.7269372693726937')), (datetime.datetime(2018, 4, 1, 0, 0), Decimal('5.3849765258215962')), (datetime.datetime(2018, 5, 1, 0, 0), Decimal('4.0217391304347826')), (datetime.datetime(2018, 6, 1, 0, 0), Decimal('4.1186440677966102')), (datetime.datetime(2018, 7, 1, 0, 0), Decimal('6.2187500000000000')), (datetime.datetime(2018, 8, 1, 0, 0), Decimal('3.2826086956521739')), (datetime.datetime(2018, 9, 1, 0, 0), Decimal('4.4661654135338346')), (datetime.datetime(2018, 10, 1, 0, 0), Decimal('4.9191176470588235')), (datetime.datetime(2018, 11, 1, 0, 0), Decimal('4.0491803278688525')), (datetime.datetime(2018, 12, 1, 0, 0), Decimal('5.3090909090909091'))], [(datetime.datetime(2017, 12, 1, 0, 0), 14.2151145038168), (datetime.datetime(2018, 1, 1, 0, 0), 12.9982584269663), (datetime.datetime(2018, 2, 1, 0, 0), 13.46), (datetime.datetime(2018, 3, 1, 0, 0), 13.0539852398524), (datetime.datetime(2018, 4, 1, 0, 0), 12.9493896713615), (datetime.datetime(2018, 5, 1, 0, 0), 13.115652173913), (datetime.datetime(2018, 6, 1, 0, 0), 12.8800564971751), (datetime.datetime(2018, 7, 1, 0, 0), 13.318125), (datetime.datetime(2018, 8, 1, 0, 0), 13.6523913043478), (datetime.datetime(2018, 9, 1, 0, 0), 14.0972180451128), (datetime.datetime(2018, 10, 1, 0, 0), 14.6723529411765), (datetime.datetime(2018, 11, 1, 0, 0), 14.936393442623), (datetime.datetime(2018, 12, 1, 0, 0), 15.9845454545455)]]

It contains two lists, each holding a date column and a metric column. I need to extract the metric values from each list and find the correlation between them.
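The extraction step described above could be sketched as follows. This is only an illustration: it uses the first three rows of each list from the data above, and the helper `to_series` and the column names `month`, `quantity` and `unitprice` are assumed labels, not something the query returns.

```python
import datetime
from decimal import Decimal

import pandas as pd

# First three (date, value) rows of each list from the data above.
data = [
    [(datetime.datetime(2017, 12, 1), Decimal("7.9618320610687023")),
     (datetime.datetime(2018, 1, 1), Decimal("3.8426966292134831")),
     (datetime.datetime(2018, 2, 1), Decimal("4.4876543209876543"))],
    [(datetime.datetime(2017, 12, 1), 14.2151145038168),
     (datetime.datetime(2018, 1, 1), 12.9982584269663),
     (datetime.datetime(2018, 2, 1), 13.46)],
]


def to_series(rows, name):
    """Turn a list of (date, value) tuples into a float Series indexed by date."""
    frame = pd.DataFrame(rows, columns=["month", name])
    return frame.set_index("month")[name].astype("float64")


# Joining on the date index keeps each month's two values on the same row.
monthly = pd.concat(
    [to_series(data[0], "quantity"), to_series(data[1], "unitprice")],
    axis=1,
)
print(monthly)
print(monthly["quantity"].corr(monthly["unitprice"]))
```

Indexing by date rather than by position means the pairing does not depend on both lists arriving in the same order.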

The two metrics here are quantity and unitprice, and the query computes the monthly average quantity and unit price for the last 1 year.

Here is what the plot looks like:

[image: plot of the two monthly metrics]

So here is what I do to get the Pearson and Spearman coefficients in pandas:

import pandas as pd
import datetime
from decimal import Decimal

# data is the list of two lists shown above (the SQL query output)
# contains date and average quantity values
data1 = data[0]
# contains date and average unitprice values
data2 = data[1]

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

# the last column of each frame holds the metric; cast Decimal values to float
quantity = df1.iloc[:, -1].astype('float64')
unitprice = df2.iloc[:, -1].astype('float64')

pearson_coeff = quantity.corr(unitprice)

spearman_coeff = quantity.corr(unitprice, method="spearman", min_periods=1)

I get a pearson_coeff value of 0.3416 and a spearman_coeff value of 0.2802.
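With only 13 monthly points, a coefficient of this size is quite uncertain. As a rough sanity check, the sketch below computes an approximate 95% confidence interval for the Pearson value using the Fisher z-transformation. It assumes independent, roughly bivariate-normal observations, which monthly series may not satisfy, and the function name `fisher_ci` is illustrative rather than a library function.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson r computed from n pairs."""
    z = math.atanh(r)              # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)    # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# r = 0.3416 from 13 monthly pairs, as reported above
low, high = fisher_ci(0.3416, 13)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

For these inputs the interval runs from roughly -0.26 to 0.75, so it includes zero. Under these assumptions, 13 points alone cannot separate a moderate positive correlation from no correlation at all.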

Now, I read somewhere that it is not a good idea to compute correlations on aggregated data. So I ran a separate SQL query for each of the metrics, this time without aggregates. Here is what it looks like:

[[(datetime.datetime(2017, 12, 1, 0, 0), 272), (datetime.datetime(2017, 12, 1, 0, 0), -16), (datetime.datetime(2017, 12, 1, 0, 0), 80), (datetime.datetime(2017, 12, 1, 0, 0), 38), (datetime.datetime(2017, 12, 1, 0, 0), -2), (datetime.datetime(2017, 12, 1, 0, 0), 79), (datetime.datetime(2017, 12, 1, 0, 0), -10), (datetime.datetime(2017, 12, 1, 0, 0), 12), (datetime.datetime(2017, 12, 1, 0, 0), 32), (datetime.datetime(2017, 12, 1, 0, 0), -1), (datetime.datetime(2017, 12, 1, 0, 0), 1), (datetime.datetime(2017, 12, 1, 0, 0), 6), (datetime.datetime(2017, 12, 1, 0, 0), 4), (datetime.datetime(2017, 12, 1, 0, 0), -12), (datetime.datetime(2017, 12, 1, 0, 0), 2), (datetime.datetime(2017, 12, 1, 0, 0), 3), (datetime.datetime(2017, 12, 1, 0, 0), 5), (datetime.datetime(2017, 12, 1, 0, 0), 52), (datetime.datetime(2017, 12, 1, 0, 0), 16), (datetime.datetime(2018, 1, 1, 0, 0), -4), (datetime.datetime(2018, 1, 1, 0, 0), 4), (datetime.datetime(2018, 1, 1, 0, 0), 12), (datetime.datetime(2018, 1, 1, 0, 0), -23), (datetime.datetime(2018, 1, 1, 0, 0), 16), (datetime.datetime(2018, 1, 1, 0, 0), 48), (datetime.datetime(2018, 1, 1, 0, 0), 5), (datetime.datetime(2018, 1, 1, 0, 0), -1), (datetime.datetime(2018, 1, 1, 0, 0), 1), (datetime.datetime(2018, 1, 1, 0, 0), 3), (datetime.datetime(2018, 1, 1, 0, 0), 17), (datetime.datetime(2018, 1, 1, 0, 0), -7), (datetime.datetime(2018, 1, 1, 0, 0), 11), (datetime.datetime(2018, 1, 1, 0, 0), -6), (datetime.datetime(2018, 1, 1, 0, 0), 7), (datetime.datetime(2018, 1, 1, 0, 0), 10), (datetime.datetime(2018, 1, 1, 0, 0), 8), (datetime.datetime(2018, 1, 1, 0, 0), -13), (datetime.datetime(2018, 1, 1, 0, 0), -9), (datetime.datetime(2018, 1, 1, 0, 0), -3), (datetime.datetime(2018, 1, 1, 0, 0), -2), (datetime.datetime(2018, 1, 1, 0, 0), 32), (datetime.datetime(2018, 1, 1, 0, 0), 6), (datetime.datetime(2018, 1, 1, 0, 0), 2), (datetime.datetime(2018, 2, 1, 0, 0), -7), (datetime.datetime(2018, 2, 1, 0, 0), 12), (datetime.datetime(2018, 2, 1, 0, 0), 32), 
(datetime.datetime(2018, 2, 1, 0, 0), 3), (datetime.datetime(2018, 2, 1, 0, 0), 11), (datetime.datetime(2018, 2, 1, 0, 0), 1), (datetime.datetime(2018, 2, 1, 0, 0), -3), (datetime.datetime(2018, 2, 1, 0, 0), -2), (datetime.datetime(2018, 2, 1, 0, 0), -1), (datetime.datetime(2018, 2, 1, 0, 0), -4), (datetime.datetime(2018, 2, 1, 0, 0), 48), (datetime.datetime(2018, 2, 1, 0, 0), 4), (datetime.datetime(2018, 2, 1, 0, 0), 16), (datetime.datetime(2018, 2, 1, 0, 0), 24), (datetime.datetime(2018, 2, 1, 0, 0), -5), (datetime.datetime(2018, 2, 1, 0, 0), 72), (datetime.datetime(2018, 2, 1, 0, 0), 2), (datetime.datetime(2018, 2, 1, 0, 0), 6), (datetime.datetime(2018, 3, 1, 0, 0), -3), (datetime.datetime(2018, 3, 1, 0, 0), 8), (datetime.datetime(2018, 3, 1, 0, 0), 24), (datetime.datetime(2018, 3, 1, 0, 0), 3), (datetime.datetime(2018, 3, 1, 0, 0), 16), (datetime.datetime(2018, 3, 1, 0, 0), 150), (datetime.datetime(2018, 3, 1, 0, 0), -23), (datetime.datetime(2018, 3, 1, 0, 0), -2), (datetime.datetime(2018, 3, 1, 0, 0), 27), (datetime.datetime(2018, 3, 1, 0, 0), -9), (datetime.datetime(2018, 3, 1, 0, 0), -5), (datetime.datetime(2018, 3, 1, 0, 0), 14), (datetime.datetime(2018, 3, 1, 0, 0), 15), (datetime.datetime(2018, 3, 1, 0, 0), 48), (datetime.datetime(2018, 3, 1, 0, 0), 4), (datetime.datetime(2018, 3, 1, 0, 0), 13), (datetime.datetime(2018, 3, 1, 0, 0), 7), (datetime.datetime(2018, 3, 1, 0, 0), -7), (datetime.datetime(2018, 3, 1, 0, 0), -6), (datetime.datetime(2018, 3, 1, 0, 0), 20), (datetime.datetime(2018, 3, 1, 0, 0), 6), (datetime.datetime(2018, 3, 1, 0, 0), 10), (datetime.datetime(2018, 3, 1, 0, 0), 12), (datetime.datetime(2018, 3, 1, 0, 0), 1), (datetime.datetime(2018, 3, 1, 0, 0), 32), (datetime.datetime(2018, 3, 1, 0, 0), -1), (datetime.datetime(2018, 3, 1, 0, 0), 2), (datetime.datetime(2018, 3, 1, 0, 0), -48), (datetime.datetime(2018, 3, 1, 0, 0), -8), (datetime.datetime(2018, 3, 1, 0, 0), 5), (datetime.datetime(2018, 3, 1, 0, 0), -10), (datetime.datetime(2018, 3, 1, 
0, 0), 17), (datetime.datetime(2018, 4, 1, 0, 0), 36), (datetime.datetime(2018, 4, 1, 0, 0), 4), (datetime.datetime(2018, 4, 1, 0, 0), 11), (datetime.datetime(2018, 4, 1, 0, 0), 60), (datetime.datetime(2018, 4, 1, 0, 0), 2), (datetime.datetime(2018, 4, 1, 0, 0), -3), (datetime.datetime(2018, 4, 1, 0, 0), -2), (datetime.datetime(2018, 4, 1, 0, 0), -8), (datetime.datetime(2018, 4, 1, 0, 0), 6), (datetime.datetime(2018, 4, 1, 0, 0), 8), (datetime.datetime(2018, 4, 1, 0, 0), 1), (datetime.datetime(2018, 4, 1, 0, 0), 22), (datetime.datetime(2018, 4, 1, 0, 0), -11), (datetime.datetime(2018, 4, 1, 0, 0), 150), (datetime.datetime(2018, 4, 1, 0, 0), -1), (datetime.datetime(2018, 4, 1, 0, 0), 5), (datetime.datetime(2018, 4, 1, 0, 0), 3), (datetime.datetime(2018, 4, 1, 0, 0), 7), (datetime.datetime(2018, 4, 1, 0, 0), 10), (datetime.datetime(2018, 4, 1, 0, 0), 32), (datetime.datetime(2018, 4, 1, 0, 0), 14), (datetime.datetime(2018, 4, 1, 0, 0), 16), (datetime.datetime(2018, 4, 1, 0, 0), 48), (datetime.datetime(2018, 4, 1, 0, 0), 12), (datetime.datetime(2018, 4, 1, 0, 0), 24), (datetime.datetime(2018, 5, 1, 0, 0), -1), (datetime.datetime(2018, 5, 1, 0, 0), 20), (datetime.datetime(2018, 5, 1, 0, 0), 16), (datetime.datetime(2018, 5, 1, 0, 0), 32), (datetime.datetime(2018, 5, 1, 0, 0), 5), (datetime.datetime(2018, 5, 1, 0, 0), 6), (datetime.datetime(2018, 5, 1, 0, 0), 120), (datetime.datetime(2018, 5, 1, 0, 0), 3), (datetime.datetime(2018, 5, 1, 0, 0), 8), (datetime.datetime(2018, 5, 1, 0, 0), -3), (datetime.datetime(2018, 5, 1, 0, 0), 36), (datetime.datetime(2018, 5, 1, 0, 0), -2), (datetime.datetime(2018, 5, 1, 0, 0), 24), (datetime.datetime(2018, 5, 1, 0, 0), 4), (datetime.datetime(2018, 5, 1, 0, 0), 1), (datetime.datetime(2018, 5, 1, 0, 0), 2), (datetime.datetime(2018, 5, 1, 0, 0), 10), (datetime.datetime(2018, 5, 1, 0, 0), -14), (datetime.datetime(2018, 5, 1, 0, 0), 14), (datetime.datetime(2018, 5, 1, 0, 0), 12), (datetime.datetime(2018, 5, 1, 0, 0), -9), 
(datetime.datetime(2018, 6, 1, 0, 0), 3), (datetime.datetime(2018, 6, 1, 0, 0), -1), (datetime.datetime(2018, 6, 1, 0, 0), 39), (datetime.datetime(2018, 6, 1, 0, 0), 5), (datetime.datetime(2018, 6, 1, 0, 0), 17), (datetime.datetime(2018, 6, 1, 0, 0), 11), (datetime.datetime(2018, 6, 1, 0, 0), 16), (datetime.datetime(2018, 6, 1, 0, 0), 10), (datetime.datetime(2018, 6, 1, 0, 0), 2), (datetime.datetime(2018, 6, 1, 0, 0), -4), (datetime.datetime(2018, 6, 1, 0, 0), 4), (datetime.datetime(2018, 6, 1, 0, 0), 32), (datetime.datetime(2018, 6, 1, 0, 0), 7), (datetime.datetime(2018, 6, 1, 0, 0), 120), (datetime.datetime(2018, 6, 1, 0, 0), 1), (datetime.datetime(2018, 6, 1, 0, 0), 12), (datetime.datetime(2018, 6, 1, 0, 0), -2), (datetime.datetime(2018, 6, 1, 0, 0), 6), (datetime.datetime(2018, 7, 1, 0, 0), -6), (datetime.datetime(2018, 7, 1, 0, 0), 7), (datetime.datetime(2018, 7, 1, 0, 0), 72), (datetime.datetime(2018, 7, 1, 0, 0), 6), (datetime.datetime(2018, 7, 1, 0, 0), 192), (datetime.datetime(2018, 7, 1, 0, 0), 10), (datetime.datetime(2018, 7, 1, 0, 0), 12), (datetime.datetime(2018, 7, 1, 0, 0), 32), (datetime.datetime(2018, 7, 1, 0, 0), 112), (datetime.datetime(2018, 7, 1, 0, 0), 3), (datetime.datetime(2018, 7, 1, 0, 0), -2), (datetime.datetime(2018, 7, 1, 0, 0), 5), (datetime.datetime(2018, 7, 1, 0, 0), 13), (datetime.datetime(2018, 7, 1, 0, 0), 22), (datetime.datetime(2018, 7, 1, 0, 0), -1), (datetime.datetime(2018, 7, 1, 0, 0), 1), (datetime.datetime(2018, 7, 1, 0, 0), 4), (datetime.datetime(2018, 7, 1, 0, 0), 15), (datetime.datetime(2018, 7, 1, 0, 0), 16), (datetime.datetime(2018, 7, 1, 0, 0), 8), (datetime.datetime(2018, 7, 1, 0, 0), 2), (datetime.datetime(2018, 8, 1, 0, 0), 7), (datetime.datetime(2018, 8, 1, 0, 0), 30), (datetime.datetime(2018, 8, 1, 0, 0), 20), (datetime.datetime(2018, 8, 1, 0, 0), 2), (datetime.datetime(2018, 8, 1, 0, 0), 6), (datetime.datetime(2018, 8, 1, 0, 0), 8), (datetime.datetime(2018, 8, 1, 0, 0), -3), (datetime.datetime(2018, 8, 1, 0, 0), 
16), (datetime.datetime(2018, 8, 1, 0, 0), 9), (datetime.datetime(2018, 8, 1, 0, 0), 5), (datetime.datetime(2018, 8, 1, 0, 0), -2), (datetime.datetime(2018, 8, 1, 0, 0), -150), (datetime.datetime(2018, 8, 1, 0, 0), 1), (datetime.datetime(2018, 8, 1, 0, 0), -1), (datetime.datetime(2018, 8, 1, 0, 0), 11), (datetime.datetime(2018, 8, 1, 0, 0), 3), (datetime.datetime(2018, 8, 1, 0, 0), 64), (datetime.datetime(2018, 8, 1, 0, 0), 10), (datetime.datetime(2018, 8, 1, 0, 0), 12), (datetime.datetime(2018, 8, 1, 0, 0), 32), (datetime.datetime(2018, 8, 1, 0, 0), 4), (datetime.datetime(2018, 9, 1, 0, 0), 2), (datetime.datetime(2018, 9, 1, 0, 0), 40), (datetime.datetime(2018, 9, 1, 0, 0), 16), (datetime.datetime(2018, 9, 1, 0, 0), -3), (datetime.datetime(2018, 9, 1, 0, 0), 5), (datetime.datetime(2018, 9, 1, 0, 0), 4), (datetime.datetime(2018, 9, 1, 0, 0), 1), (datetime.datetime(2018, 9, 1, 0, 0), -7), (datetime.datetime(2018, 9, 1, 0, 0), 3), (datetime.datetime(2018, 9, 1, 0, 0), 6), (datetime.datetime(2018, 9, 1, 0, 0), -2), (datetime.datetime(2018, 9, 1, 0, 0), -1), (datetime.datetime(2018, 9, 1, 0, 0), 32), (datetime.datetime(2018, 10, 1, 0, 0), 2), (datetime.datetime(2018, 10, 1, 0, 0), 8), (datetime.datetime(2018, 10, 1, 0, 0), 17), (datetime.datetime(2018, 10, 1, 0, 0), 3), (datetime.datetime(2018, 10, 1, 0, 0), 5), (datetime.datetime(2018, 10, 1, 0, 0), 9), (datetime.datetime(2018, 10, 1, 0, 0), 120), (datetime.datetime(2018, 10, 1, 0, 0), -1), (datetime.datetime(2018, 10, 1, 0, 0), 6), (datetime.datetime(2018, 10, 1, 0, 0), -6), (datetime.datetime(2018, 10, 1, 0, 0), 40), (datetime.datetime(2018, 10, 1, 0, 0), 16), (datetime.datetime(2018, 10, 1, 0, 0), 20), (datetime.datetime(2018, 10, 1, 0, 0), -3), (datetime.datetime(2018, 10, 1, 0, 0), 1), (datetime.datetime(2018, 10, 1, 0, 0), 4), (datetime.datetime(2018, 10, 1, 0, 0), 32), (datetime.datetime(2018, 10, 1, 0, 0), 7), (datetime.datetime(2018, 11, 1, 0, 0), 48), (datetime.datetime(2018, 11, 1, 0, 0), 4), 
(datetime.datetime(2018, 11, 1, 0, 0), 16), (datetime.datetime(2018, 11, 1, 0, 0), 80), (datetime.datetime(2018, 11, 1, 0, 0), 32), (datetime.datetime(2018, 11, 1, 0, 0), 12), (datetime.datetime(2018, 11, 1, 0, 0), 10), (datetime.datetime(2018, 11, 1, 0, 0), 5), (datetime.datetime(2018, 11, 1, 0, 0), -24), (datetime.datetime(2018, 11, 1, 0, 0), 6), (datetime.datetime(2018, 11, 1, 0, 0), 72), (datetime.datetime(2018, 11, 1, 0, 0), 2), (datetime.datetime(2018, 11, 1, 0, 0), -3), (datetime.datetime(2018, 11, 1, 0, 0), 13), (datetime.datetime(2018, 11, 1, 0, 0), -12), (datetime.datetime(2018, 11, 1, 0, 0), 3), (datetime.datetime(2018, 11, 1, 0, 0), 17), (datetime.datetime(2018, 11, 1, 0, 0), -1), (datetime.datetime(2018, 11, 1, 0, 0), 1), (datetime.datetime(2018, 11, 1, 0, 0), -5), (datetime.datetime(2018, 12, 1, 0, 0), -6), (datetime.datetime(2018, 12, 1, 0, 0), 5), (datetime.datetime(2018, 12, 1, 0, 0), 3), (datetime.datetime(2018, 12, 1, 0, 0), 12), (datetime.datetime(2018, 12, 1, 0, 0), 16), (datetime.datetime(2018, 12, 1, 0, 0), 8), (datetime.datetime(2018, 12, 1, 0, 0), 4), (datetime.datetime(2018, 12, 1, 0, 0), 128), (datetime.datetime(2018, 12, 1, 0, 0), 10), (datetime.datetime(2018, 12, 1, 0, 0), 6), (datetime.datetime(2018, 12, 1, 0, 0), 2), (datetime.datetime(2018, 12, 1, 0, 0), -1), (datetime.datetime(2018, 12, 1, 0, 0), 13), (datetime.datetime(2018, 12, 1, 0, 0), 1)], [(datetime.datetime(2017, 12, 1, 0, 0), 12.72), (datetime.datetime(2017, 12, 1, 0, 0), 25.49), (datetime.datetime(2017, 12, 1, 0, 0), 20.38), (datetime.datetime(2017, 12, 1, 0, 0), 10.95), (datetime.datetime(2017, 12, 1, 0, 0), 9.95), (datetime.datetime(2017, 12, 1, 0, 0), 12.75), (datetime.datetime(2017, 12, 1, 0, 0), 8.5), (datetime.datetime(2018, 1, 1, 0, 0), 8.5), (datetime.datetime(2018, 1, 1, 0, 0), 25.49), (datetime.datetime(2018, 1, 1, 0, 0), 12.75), (datetime.datetime(2018, 1, 1, 0, 0), 24.96), (datetime.datetime(2018, 1, 1, 0, 0), 9.95), (datetime.datetime(2018, 1, 1, 0, 0), 10.95), 
(datetime.datetime(2018, 1, 1, 0, 0), 19.96), (datetime.datetime(2018, 1, 1, 0, 0), 0.0), (datetime.datetime(2018, 2, 1, 0, 0), 12.75), (datetime.datetime(2018, 2, 1, 0, 0), 24.96), (datetime.datetime(2018, 2, 1, 0, 0), 10.95), (datetime.datetime(2018, 2, 1, 0, 0), 8.5), (datetime.datetime(2018, 2, 1, 0, 0), 19.96), (datetime.datetime(2018, 2, 1, 0, 0), 9.95), (datetime.datetime(2018, 3, 1, 0, 0), 24.96), (datetime.datetime(2018, 3, 1, 0, 0), 9.95), (datetime.datetime(2018, 3, 1, 0, 0), 10.95), (datetime.datetime(2018, 3, 1, 0, 0), 9.86), (datetime.datetime(2018, 3, 1, 0, 0), 4.0), (datetime.datetime(2018, 3, 1, 0, 0), 12.75), (datetime.datetime(2018, 3, 1, 0, 0), 19.96), (datetime.datetime(2018, 3, 1, 0, 0), 8.5), (datetime.datetime(2018, 4, 1, 0, 0), 19.96), (datetime.datetime(2018, 4, 1, 0, 0), 8.5), (datetime.datetime(2018, 4, 1, 0, 0), 9.95), (datetime.datetime(2018, 4, 1, 0, 0), 12.75), (datetime.datetime(2018, 4, 1, 0, 0), 24.96), (datetime.datetime(2018, 4, 1, 0, 0), 10.95), (datetime.datetime(2018, 5, 1, 0, 0), 24.96), (datetime.datetime(2018, 5, 1, 0, 0), 19.96), (datetime.datetime(2018, 5, 1, 0, 0), 9.95), (datetime.datetime(2018, 5, 1, 0, 0), 12.75), (datetime.datetime(2018, 5, 1, 0, 0), 10.95), (datetime.datetime(2018, 5, 1, 0, 0), 5.0), (datetime.datetime(2018, 6, 1, 0, 0), 12.75), (datetime.datetime(2018, 6, 1, 0, 0), 4.0), (datetime.datetime(2018, 6, 1, 0, 0), 8.5), (datetime.datetime(2018, 6, 1, 0, 0), 10.95), (datetime.datetime(2018, 6, 1, 0, 0), 19.96), (datetime.datetime(2018, 6, 1, 0, 0), 9.95), (datetime.datetime(2018, 6, 1, 0, 0), 19.95), (datetime.datetime(2018, 6, 1, 0, 0), 24.96), (datetime.datetime(2018, 7, 1, 0, 0), 19.96), (datetime.datetime(2018, 7, 1, 0, 0), 8.5), (datetime.datetime(2018, 7, 1, 0, 0), 24.96), (datetime.datetime(2018, 7, 1, 0, 0), 10.95), (datetime.datetime(2018, 7, 1, 0, 0), 9.95), (datetime.datetime(2018, 7, 1, 0, 0), 12.75), (datetime.datetime(2018, 8, 1, 0, 0), 10.95), (datetime.datetime(2018, 8, 1, 0, 0), 24.96), 
(datetime.datetime(2018, 8, 1, 0, 0), 19.96), (datetime.datetime(2018, 8, 1, 0, 0), 9.95), (datetime.datetime(2018, 8, 1, 0, 0), 8.5), (datetime.datetime(2018, 8, 1, 0, 0), 12.75), (datetime.datetime(2018, 9, 1, 0, 0), 10.95), (datetime.datetime(2018, 9, 1, 0, 0), 24.96), (datetime.datetime(2018, 9, 1, 0, 0), 9.95), (datetime.datetime(2018, 9, 1, 0, 0), 12.75), (datetime.datetime(2018, 10, 1, 0, 0), 12.75), (datetime.datetime(2018, 10, 1, 0, 0), 24.96), (datetime.datetime(2018, 10, 1, 0, 0), 9.95), (datetime.datetime(2018, 10, 1, 0, 0), 10.95), (datetime.datetime(2018, 11, 1, 0, 0), 12.75), (datetime.datetime(2018, 11, 1, 0, 0), 32.04), (datetime.datetime(2018, 11, 1, 0, 0), 24.96), (datetime.datetime(2018, 11, 1, 0, 0), 10.95), (datetime.datetime(2018, 12, 1, 0, 0), 32.04), (datetime.datetime(2018, 12, 1, 0, 0), 12.75), (datetime.datetime(2018, 12, 1, 0, 0), 10.95), (datetime.datetime(2018, 12, 1, 0, 0), 24.96)]]

I performed the same operations: loading the data into pandas, extracting the metric column from each of the two DataFrames, and finding the correlation between them.

Now, without aggregates, I get a pearson_coeff value of 0.0189 and a spearman_coeff value of 0.0395.
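One thing that may be worth checking here: the two non-aggregated lists do not have the same length, and `Series.corr` aligns the two Series on their index before computing anything. With the default positional index, row i of one list is paired with row i of the other, and any rows without a partner are dropped. The sketch below illustrates this using the first few values of each list from the data above; the variable names are only labels.

```python
import pandas as pd

# First values of each non-aggregated list; the lengths differ on purpose.
quantity = pd.Series([272, -16, 80, 38, -2], dtype="float64")
unitprice = pd.Series([12.72, 25.49, 20.38], dtype="float64")

# corr() aligns on the index, so only index labels 0..2 take part.
full = quantity.corr(unitprice)
head = quantity.iloc[:3].corr(unitprice)

print(len(quantity), len(unitprice))
print(full, head)  # the two coefficients are identical
```

If the rows of the two queries do not describe the same transactions in the same order, the positional pairs are arbitrary and the resulting coefficient says little about the relationship between the metrics.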

But it seems quite strange to me that the values dropped so sharply. The Pearson coefficient went from 0.3416 to 0.0189, and the Spearman coefficient from 0.2802 to 0.0395.

I am not sure why there would be such a sharp drop. Looking at the plot, the two metrics seem to move together somewhat in a positive direction, and I expected a much higher correlation value.
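For what it is worth, a gap between the two levels is not unusual in itself: averaging within each month removes row-level scatter, so correlations between group means tend to be larger than correlations between individual rows. The sketch below shows this on small synthetic data, constructed so that the monthly means line up perfectly while the deviations inside each month are unrelated. The columns `month`, `x` and `y` are made up for the illustration and are not the real metrics.

```python
import pandas as pd

# Synthetic rows: group means follow y = x, within-group noise is balanced.
rows = []
for month in range(4):
    for dx, dy in [(3, 3), (3, -3), (-3, 3), (-3, -3)]:
        rows.append({"month": month, "x": month + dx, "y": month + dy})
df = pd.DataFrame(rows)

# Correlation over all individual rows.
row_level = df["x"].corr(df["y"])

# Correlation over the per-month averages.
means = df.groupby("month")[["x", "y"]].mean()
aggregated = means["x"].corr(means["y"])

print(f"row level:  {row_level:.3f}")
print(f"aggregated: {aggregated:.3f}")
```

Here the aggregated coefficient is 1 while the row-level one is close to 0.12, even though both come from the same rows. The two numbers answer different questions: one is about months, the other about individual records.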

How do I know which one to use to determine the correlation: the correlation between the aggregated metrics, or the correlation between the non-aggregated metrics? And how do I check whether the result I get is valid?

...