Как подсчитать объединенные значения с помощью reduce ()? - PullRequest
1 голос
/ 27 мая 2020

У меня есть список из четырех фреймов данных, и я хочу объединить их все в одну таблицу на основе общего столбца ('id'), а затем подсчитать эти объединенные строки. Я думал, что смогу сделать это, используя выражения reduce() и lambda примерно так:

number_of_rows = reduce(lambda a, b: a['id'].isin(b['id']).value_counts(), list_of_dataframes)

Но, конечно, это не работает. Моя проблема в том, что я не знаю, как написать код, который добавит количество общих строк из первой пары функций value_counts () к следующей паре. Заранее благодарим!

Список фреймов данных:

list_of_df = [df_hops, df_method_mash_temp, df_ingredients_malt, whole_table]

Каждый фрейм данных:

df_hops:

|    |   hops.name     | hops.add  | hops.attribute  | hops.amount.value  | hops.amount.unit  | id  |                name                 |
+----+-----------------+-----------+-----------------+--------------------+-------------------+-----+-------------------------------------+
| 0  | Fuggles         | start     | bitter          |             25.00  | grams             |  1  | Buzz                                |
| 1  | First Gold      | start     | bitter          |             25.00  | grams             |  1  | Buzz                                |
| 2  | Fuggles         | middle    | flavour         |             37.50  | grams             |  1  | Buzz                                |
| 3  | First Gold      | middle    | flavour         |             37.50  | grams             |  1  | Buzz                                |
| 4  | Cascade         | end       | flavour         |             37.50  | grams             |  1  | Buzz                                |
| 5  | Amarillo        | start     | bitter          |             13.80  | grams             |  2  | Trashy Blonde                       |
| 6  | Simcoe          | start     | bitter          |             13.80  | grams             |  2  | Trashy Blonde                       |
| 7  | Amarillo        | end       | flavour         |             26.30  | grams             |  2  | Trashy Blonde                       |
| 8  | Motueka         | end       | flavour         |             18.80  | grams             |  2  | Trashy Blonde                       |
| 9  | Bramling Cross  | middle    | bitter          |             10.00  | grams             |  3  | Berliner Weisse With Yuzu - B-Sides |
+----+-----------------+-----------+-----------------+--------------------+-------------------+-----+-------------------------------------+

df_method_mash_temp:

|    | method.mash_temp.duration  | method.mash_temp.temp.value  | method.mash_temp.temp.unit  | id  |                name                 |
+----+----------------------------+------------------------------+-----------------------------+-----+-------------------------------------+
| 0  | 75.0                       |                        64.0  | celsius                     |  1  | Buzz                                |
| 1  | NaN                        |                        69.0  | celsius                     |  2  | Trashy Blonde                       |
| 2  | 10.0                       |                        60.0  | celsius                     |  3  | Berliner Weisse With Yuzu - B-Sides |
| 3  | 30.0                       |                        65.0  | celsius                     |  3  | Berliner Weisse With Yuzu - B-Sides |
+----+----------------------------+------------------------------+-----------------------------+-----+-------------------------------------+

df_ingredients_malt:

|    | ingredients.malt.name   | ingredients.malt.amount.value  | ingredients.malt.amount.unit  | id  |                name                 |
+----+-------------------------+--------------------------------+-------------------------------+-----+-------------------------------------+
| 0  | Maris Otter Extra Pale  |                       3.30000  | kilograms                     |  1  | Buzz                                |
| 1  | Caramalt                |                       0.20000  | kilograms                     |  1  | Buzz                                |
| 2  | Munich                  |                       0.40000  | kilograms                     |  1  | Buzz                                |
| 3  | Maris Otter Extra Pale  |                       3.25000  | kilograms                     |  2  | Trashy Blonde                       |
| 4  | Caramalt                |                       0.20000  | kilograms                     |  2  | Trashy Blonde                       |
| 5  | Munich                  |                       0.40000  | kilograms                     |  2  | Trashy Blonde                       |
| 6  | Propino Pale Malt       |                       1.63000  | kilograms                     |  3  | Berliner Weisse With Yuzu - B-Sides |
+----+-------------------------+--------------------------------+-------------------------------+-----+-------------------------------------+

полная_таблица (это основная таблица, из которой взяты остальные три выше):

|    | id  |                name                  |             tagline               | first_brewed  |                                                                                                                                                                                                                          description                                                                                                                                                                                                                           |               image_url                | abv   | ibu   | target_fg  | target_og  | ebc   |  srm   | ph   | attenuation_level  |                                                    food_pairing                                                      |                                                                                      brewers_tips                                                                                       |     contributed_by      | volume.value  | volume.unit  | boil_volume.value  | boil_volume.unit  |                                                                                                                method.mash_temp                                                                                                                  | method.fermentation.temp.value  | method.fermentation.temp.unit  |                                              method.twist                                                |                                                                                                                                                                     ingredients.malt                                                                                                                                                                       |                                                                                                                                                                                                                                                               ingredients.hops                                                                                                                                                                                                                                                                 |      ingredients.yeast      |
+----+-----+--------------------------------------+-----------------------------------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------+-------+-------+------------+------------+-------+--------+------+--------------------+----------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+---------------+--------------+--------------------+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+--------------------------------+----------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------+
| 0  |  1  | Buzz                                 | A Real Bitter Experience.         | 09/2007       | A light, crisp and bitter IPA brewed with English and American hops. A small batch brewed only once.                                                                                                                                                                                                                                                                                                                                                           | https://images.punkapi.com/v2/keg.png  | 4.50  | 60.0  |    1010.0  |    1044.0  | 20.0  | 10.00  | 4.4  |             75.00  | [Spicy chicken tikka masala, Grilled chicken quesadilla, Caramel toffee cake]                                        | The earthy and floral aromas from the hops can be overpowering. Drop a little Cascade in at the end of the boil to lift the profile with a bit of citrus.                               | Sam Mason <samjbmason>  |           20  | litres       |                25  | litres            | [{'temp': {'value': 64, 'unit': 'celsius'}, 'duration': 75}]                                                                                                                                                                                     |                           19.0  | celsius                        | None                                                                                                     | [{'name': 'Maris Otter Extra Pale', 'amount': {'value': 3.3, 'unit': 'kilograms'}}, {'name': 'Caramalt', 'amount': {'value': 0.2, 'unit': 'kilograms'}}, {'name': 'Munich', 'amount': {'value': 0.4, 'unit': 'kilograms'}}]                                                                                                                                | [{'name': 'Fuggles', 'amount': {'value': 25, 'unit': 'grams'}, 'add': 'start', 'attribute': 'bitter'}, {'name': 'First Gold', 'amount': {'value': 25, 'unit': 'grams'}, 'add': 'start', 'attribute': 'bitter'}, {'name': 'Fuggles', 'amount': {'value': 37.5, 'unit': 'grams'}, 'add': 'middle', 'attribute': 'flavour'}, {'name': 'First Gold', 'amount': {'value': 37.5, 'unit': 'grams'}, 'add': 'middle', 'attribute': 'flavour'}, {'name': 'Cascade', 'amount': {'value': 37.5, 'unit': 'grams'}, 'add': 'end', 'attribute': 'flavour'}]  | Wyeast 1056 - American Ale™ |
| 1  |  2  | Trashy Blonde                        | You Know You Shouldn't            | 04/2008       | A titillating, neurotic, peroxide punk of a Pale Ale. Combining attitude, style, substance, and a little bit of low self esteem for good measure; what would your mother say? The seductive lure of the sassy passion fruit hop proves too much to resist. All that is even before we get onto the fact that there are no additives, preservatives, pasteurization or strings attached. All wrapped up with the customary BrewDog bite and imaginative twist.  | https://images.punkapi.com/v2/2.png    | 4.10  | 41.5  |    1010.0  |    1041.7  | 15.0  | 15.00  | 4.4  |             76.00  | [Fresh crab with lemon, Garlic butter dipping sauce, Goats cheese salad, Creamy lemon bar doused in powdered sugar]  | Be careful not to collect too much wort from the mash. Once the sugars are all washed out there are some very unpleasant grainy tasting compounds that can be extracted into the wort.  | Sam Mason <samjbmason>  |           20  | litres       |                25  | litres            | [{'temp': {'value': 69, 'unit': 'celsius'}, 'duration': None}]                                                                                                                                                                                   |                           18.0  | celsius                        | None                                                                                                     | [{'name': 'Maris Otter Extra Pale', 'amount': {'value': 3.25, 'unit': 'kilograms'}}, {'name': 'Caramalt', 'amount': {'value': 0.2, 'unit': 'kilograms'}}, {'name': 'Munich', 'amount': {'value': 0.4, 'unit': 'kilograms'}}]                                                                                                                               | [{'name': 'Amarillo', 'amount': {'value': 13.8, 'unit': 'grams'}, 'add': 'start', 'attribute': 'bitter'}, {'name': 'Simcoe', 'amount': {'value': 13.8, 'unit': 'grams'}, 'add': 'start', 'attribute': 'bitter'}, {'name': 'Amarillo', 'amount': {'value': 26.3, 'unit': 'grams'}, 'add': 'end', 'attribute': 'flavour'}, {'name': 'Motueka', 'amount': {'value': 18.8, 'unit': 'grams'}, 'add': 'end', 'attribute': 'flavour'}]                                                                                                                | Wyeast 1056 - American Ale™ |
| 2  |  3  | Berliner Weisse With Yuzu - B-Sides  | Japanese Citrus Berliner Weisse.  | 11/2015       | Japanese citrus fruit intensifies the sour nature of this German classic.                                                                                                                                                                                                                                                                                                                                                                                      | https://images.punkapi.com/v2/keg.png  | 4.20  |  8.0  |    1007.0  |    1040.0  |  8.0  |  4.00  | 3.2  |             83.00  | [Smoked chicken wings, Miso ramen, Yuzu cheesecake]                                                                  | Clean everything twice. All you want is the clean sourness of lactobacillus.                                                                                                            | Sam Mason <samjbmason>  |           20  | litres       |                25  | litres            | [{'temp': {'value': 60, 'unit': 'celsius'}, 'duration': 10}, {'temp': {'value': 65, 'unit': 'celsius'}, 'duration': 30}, {'temp': {'value': 72, 'unit': 'celsius'}, 'duration': 10}, {'temp': {'value': 78, 'unit': 'celsius'}, 'duration': 5}]  |                           21.0  | celsius                        | Soured naturally using the kettle souring technique, Yuzu fruit: 50g at middle, Yuzu juice: 200ml at FV  | [{'name': 'Propino Pale Malt', 'amount': {'value': 1.63, 'unit': 'kilograms'}}, {'name': 'Wheat Malt', 'amount': {'value': 1.63, 'unit': 'kilograms'}}, {'name': 'Propino Pale Malt for kettle souring', 'amount': {'value': 0.03, 'unit': 'kilograms'}}, {'name': 'Acidulated Malt for kettle souring', 'amount': {'value': 0.03, 'unit': 'kilograms'}}]  | [{'name': 'Bramling Cross', 'amount': {'value': 10, 'unit': 'grams'}, 'add': 'middle', 'attribute': 'bitter'}]                                                                                                                                                                                                                                                                                                                                                                                                                                 | Wyeast 1056 - American Ale™ |
+----+-----+--------------------------------------+-----------------------------------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------+-------+-------+------------+------------+-------+--------+------+--------------------+----------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+---------------+--------------+--------------------+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+--------------------------------+----------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------+

@ Аншул:

list_of_df = [df_hops,df_method_mash_temp,df_ingredients_malt,whole_table]

merged_count = 0

def merge_dfs(df_a, df_b):
    df = df_a.merge(df_b, on=['id', 'name'], how='outer')

    global merged_count
    merged_count += df_a.shape[0] + df_b.shape[0] - df.shape[0]
    return df

# use reduce function to get the final merged dataframe
df_final = reduce(merge_dfs, list_of_df)

print(merged_count)

Остальная часть моего кода - это запросы от API и простые нормализации, сделанные для создания таблиц, которые вы видите выше. Я не думаю, что это нужно вставлять здесь в контексте этого вопроса, но если вы хотите, чтобы я разместил это здесь, просто напишите, и я сделаю это.

1 Ответ

0 голосов
/ 27 мая 2020

Мне стало любопытно, и я захотел попробовать это. Я создал свой собственный набор примеров фреймов данных, реализовал слияние фреймов данных с помощью «уменьшения», использовал «глобальную» переменную, чтобы получить общее количество объединенных строк. Вот что я сделал, и у меня это сработало:

import pandas as pd
from functools import reduce

data1 = {
"Id": [1,2,3,4,5], 
"Data": ['a', 'b', 'c', 'd', 'e']
}

data2 = {
"Id": [2,3,4,5,6], 
"Data": ['b', 'c', 'd', 'e', 'f']
}

data3 = {
"Id": [3,4,5,6,7], 
"Data": ['c', 'd', 'e', 'f', 'g']
}

data4 = {
"Id": [4,5,6,7,8], 
"Data": ['d', 'e', 'f', 'g', 'h']
}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
df3 = pd.DataFrame(data3)
df4 = pd.DataFrame(data4)

Dataframes:

df1:

  Data  Id
0    a   1
1    b   2
2    c   3
3    d   4
4    e   5

df2:

  Data  Id
0    b   2
1    c   3
2    d   4
3    e   5
4    f   6

df3:

  Data  Id
0    c   3
1    d   4
2    e   5
3    f   6
4    g   7

df4:

  Data  Id
0    d   4
1    e   5
2    f   6
3    g   7
4    h   8

.

# define a counter variable to be used as a global
merged_count = 0

# define the function to be called via reduce
def merge_dfs(df_a, df_b):
    df = df_a.merge(df_b, on=['Id', 'Data'], how='outer')

    global merged_count
    merged_count += df_a.shape[0] + df_b.shape[0] - df.shape[0]
    return df

# use reduce function to get the final merged dataframe
df_final = reduce(merge_dfs, [df1, df2, df3, df4])

print(df_final)
print(merged_count)

df_final:

  Data  Id
0    a   1
1    b   2
2    c   3
3    d   4
4    e   5
5    f   6
6    g   7
7    h   8

Количество объединенных строк:

12

Если это вам поможет, вы можете попробовать реализовать это в своем сценарии. Если есть лучший способ добиться этого, я действительно хотел бы знать и при необходимости улучшить это решение.

...