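Here's one way, with an axis parameter so the fill can be applied along either axis. Each zero is replaced by the mean of the non-zero entries along that axis, and a slice that is all zeros stays 0 -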
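import numpy as np

def fill0s(data, axis):
    m = data != 0                      # mask of the non-zero entries
    s = data.sum(axis, keepdims=True)  # sum along axis (zeros add nothing)
    c = m.sum(axis, keepdims=True)     # count of non-zero entries along axis
    c[c == 0] = 1  # to avoid warning of division by 0
    return np.where(m, data, s / c)    # keep non-zeros, fill zeros with the mean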
Sample run -
In [143]: data
Out[143]:
array([[0, 0, 0, 0, 3, 2, 4, 4, 0],
       [4, 6, 8, 9, 3, 1, 1, 4, 0],
       [6, 6, 8, 9, 3, 1, 1, 4, 0],
       [0, 6, 8, 9, 3, 1, 1, 4, 0]])
In [144]: fill0s(data,axis=0)
Out[144]:
array([[5., 6., 8., 9., 3., 2., 4., 4., 0.],
       [4., 6., 8., 9., 3., 1., 1., 4., 0.],
       [6., 6., 8., 9., 3., 1., 1., 4., 0.],
       [5., 6., 8., 9., 3., 1., 1., 4., 0.]])
In [147]: fill0s(data,axis=1)
Out[147]:
array([[3.25, 3.25, 3.25, 3.25, 3.  , 2.  , 4.  , 4.  , 3.25],
       [4.  , 6.  , 8.  , 9.  , 3.  , 1.  , 1.  , 4.  , 4.5 ],
       [6.  , 6.  , 8.  , 9.  , 3.  , 1.  , 1.  , 4.  , 4.75],
       [4.57, 6.  , 8.  , 9.  , 3.  , 1.  , 1.  , 4.  , 4.57]])
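To see where those numbers come from, the intermediate arrays can be inspected on the same sample input. This is only a step-by-step sketch of the function body for axis=0; the names `means` and `filled` are mine, added for illustration -

```python
import numpy as np

data = np.array([[0, 0, 0, 0, 3, 2, 4, 4, 0],
                 [4, 6, 8, 9, 3, 1, 1, 4, 0],
                 [6, 6, 8, 9, 3, 1, 1, 4, 0],
                 [0, 6, 8, 9, 3, 1, 1, 4, 0]])

axis = 0
m = data != 0                      # True where the entry is non-zero
s = data.sum(axis, keepdims=True)  # per-column sums, shape (1, 9)
c = m.sum(axis, keepdims=True)     # per-column count of non-zero entries
c[c == 0] = 1                      # all-zero column: 0/1 = 0 instead of 0/0
means = s / c                      # illustrative name: mean of the non-zeros
filled = np.where(m, data, means)  # illustrative name: zeros replaced by means

print(s.tolist())             # [[10, 18, 24, 27, 12, 5, 7, 16, 0]]
print(c.tolist())             # [[2, 3, 3, 3, 4, 4, 4, 4, 1]]
print(filled[:, 0].tolist())  # [5.0, 4.0, 6.0, 5.0]
```

Because of keepdims=True, `s` and `c` keep a length-1 dimension along `axis`, so `s / c` broadcasts against `data` inside np.where for either axis without any extra reshaping.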
Timings on a larger dataset -
In [150]: np.random.seed(0)
In [151]: data = np.random.randint(0,10,(5000,5000))
In [152]: %timeit fill0s(data,axis=0)
161 ms ± 4.46 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [153]: %timeit fill0s(data,axis=1)
155 ms ± 6.31 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# @yatu's solution
In [155]: %%timeit
...: m = data == 0
...: means = np.ma.array(data, mask = m).mean(0)
...: data + m * means.data
302 ms ± 3.03 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [156]: %%timeit
...: m = data == 0
...: means = np.ma.array(data, mask = m).mean(1)
...: data + m * means.data[:,None]
291 ms ± 2.44 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
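As a sanity check, the two approaches can be compared on the small sample array. This is a rough sketch: `fill0s_masked` is a hypothetical wrapper name I'm using around the masked-array snippet, and I drop the all-zero last column, since I haven't checked what the masked mean leaves in `.data` for a fully masked slice -

```python
import numpy as np

def fill0s(data, axis):
    m = data != 0
    s = data.sum(axis, keepdims=True)
    c = m.sum(axis, keepdims=True)
    c[c == 0] = 1  # avoid division by 0 for all-zero slices
    return np.where(m, data, s / c)

def fill0s_masked(data, axis):
    # Hypothetical wrapper around the masked-array version timed above.
    m = data == 0
    means = np.ma.array(data, mask=m).mean(axis)
    # Re-insert the reduced axis so the means broadcast against data.
    return data + m * np.expand_dims(means.data, axis)

sample = np.array([[0, 0, 0, 0, 3, 2, 4, 4, 0],
                   [4, 6, 8, 9, 3, 1, 1, 4, 0],
                   [6, 6, 8, 9, 3, 1, 1, 4, 0],
                   [0, 6, 8, 9, 3, 1, 1, 4, 0]])[:, :-1]  # drop all-zero column

same0 = np.allclose(fill0s(sample, 0), fill0s_masked(sample, 0))
same1 = np.allclose(fill0s(sample, 1), fill0s_masked(sample, 1))
print(same0, same1)  # True True
```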