Color-coded plot for each row of a column in Python
0 votes
/ June 20, 2020

I want to create a plot like this one, where the x-axis shows the columns As, Cr, Cd, and Pb, and the box plots are color-coded by the fish column. Is this possible? My data (CSV):

fish,As,Cr,Cd,Pb
T. ilisha,0.023,0.002,0.039,0.004
G. chapra,0.224,0.011,0.048,0.005
M. vittatus,0.678,0.015,0.236,0.106
G. giuris,0.368,0.011,0.179,0.037
C. punctatus,0.274,0.016,0.124,0.035
M. armatus,0.461,0.015,0.476,0.039
P. ticto,0.437,0.021,0.533,0.048
S. cascasia,0.301,0.009,0.068,0.011
A. mola,0.454,0.016,0.179,0.065
H. fossilis,0.423,0.023,0.423,0.117
L.bata,0.295,0.019,0.287,0.039
W. attu,0.448,0.019,0.231,0.035

1 Answer

1 vote
/ June 21, 2020

Here is an approach that draws a box plot for each element, with a colored marker for each fish.

pd.melt is used to create a "long form" of the dataframe, which is easier for seaborn to work with. In essence, 2 new columns are created: one with the name of the element and one with the corresponding value. Each original row is converted into 4 new rows.
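To make the reshaping concrete, here is a minimal sketch that applies pd.melt to only the first two rows of the question's data. The names small and small_long are illustrative and are not part of the answer's code.

```python
import pandas as pd
from io import StringIO

# First two rows of the question's data, used only for illustration
small = pd.read_csv(StringIO('''fish,As,Cr,Cd,Pb
T. ilisha,0.023,0.002,0.039,0.004
G. chapra,0.224,0.011,0.048,0.005'''))

# 'fish' is kept as the identifier column; the four element columns
# are stacked into an 'element' column and a 'value' column
small_long = pd.melt(small, id_vars='fish', var_name='element', value_name='value')

print(small_long.shape)  # (8, 3): 2 fish x 4 elements
print(small_long)
```

The rows come out grouped by element (all As values first, then Cr, and so on), which is the layout seaborn expects when x='element' and y='value' are passed.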

from matplotlib import pyplot as plt
import pandas as pd
import seaborn as sns
from io import StringIO

df_data = StringIO('''fish,As,Cr,Cd,Pb
T. ilisha,0.023,0.002,0.039,0.004
G. chapra,0.224,0.011,0.048,0.005
M. vittatus,0.678,0.015,0.236,0.106
G. giuris,0.368,0.011,0.179,0.037
C. punctatus,0.274,0.016,0.124,0.035
M. armatus,0.461,0.015,0.476,0.039
P. ticto,0.437,0.021,0.533,0.048
S. cascasia,0.301,0.009,0.068,0.011
A. mola,0.454,0.016,0.179,0.065
H. fossilis,0.423,0.023,0.423,0.117
L.bata,0.295,0.019,0.287,0.039
W. attu,0.448,0.019,0.231,0.035''')
df = pd.read_csv(df_data)

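# Reshape to long form: one row per (fish, element) pair
df_long = pd.melt(df, 'fish', var_name='element', value_name='value')

# One grey box per element; color= is used because no hue is assigned here
sns.boxplot(x='element', y='value', color='lightgrey', data=df_long, showfliers=False)
# One diamond marker per fish, drawn on top of the boxes (zorder=3)
sns.scatterplot(x='element', y='value', hue='fish', palette='Set3', edgecolor='black', marker='D', data=df_long, zorder=3)
plt.show()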

Resulting plot

PS: To avoid overlapping markers, stripplot can be used instead of scatterplot:

sns.stripplot(x='element', y='value', hue='fish', palette='Set3', linewidth=1, edgecolor='black', marker='D', data=df_long, zorder=3)

With scatterplot (but not with stripplot), you can use a different marker for each fish. The markers list only takes effect when style is also mapped to the fish column:

markers = ['o', 'v', '^', '8', '*', 'P', 'D', 'X', 's', 'p', '<', '>']
# style='fish' assigns one marker from the list to each fish
sns.scatterplot(x='element', y='value', hue='fish', style='fish', palette='Set3', linewidth=1, edgecolor='black', markers=markers, data=df_long, zorder=3)
...