Here is a full demo, as you requested. First we load the data, merge the original train and test sets, and shuffle the result once; then we take the first 50,000 samples for training and the remaining 10,000 for validation.
In [21]: import tensorflow
In [22]: import tensorflow.keras.datasets as datasets
In [23]: cifar10 = datasets.cifar10.load_data()
In [24]: (X_train, Y_train), (X_test, Y_test) = datasets.cifar10.load_data()
In [25]: X_train.shape, Y_train.shape
Out[25]: ((50000, 32, 32, 3), (50000, 1))
In [26]: X_test.shape, Y_test.shape
Out[26]: ((10000, 32, 32, 3), (10000, 1))
In [27]: import numpy as np
In [28]: X, Y = np.vstack((X_train, X_test)), np.vstack((Y_train, Y_test))
In [29]: X.shape, Y.shape
Out[29]: ((60000, 32, 32, 3), (60000, 1))
In [30]: # Shuffle samples and labels together along axis 0
    ...: def shuffle_train_data(X_train, Y_train):
    ...:     """Apply one random permutation to both arrays."""
    ...:     perm = np.random.permutation(len(Y_train))
    ...:     Xtr_shuf = X_train[perm]
    ...:     Ytr_shuf = Y_train[perm]
    ...:
    ...:     return Xtr_shuf, Ytr_shuf
In [31]: X_shuffled, Y_shuffled = shuffle_train_data(X, Y)
In [32]: (X_train_new, Y_train_new) = X_shuffled[:50000, ...], Y_shuffled[:50000, ...]
In [33]: (X_test_new, Y_test_new) = X_shuffled[50000:, ...], Y_shuffled[50000:, ...]
In [34]: X_train_new.shape, Y_train_new.shape
Out[34]: ((50000, 32, 32, 3), (50000, 1))
In [35]: X_test_new.shape, Y_test_new.shape
Out[35]: ((10000, 32, 32, 3), (10000, 1))
The shuffle_train_data function shuffles the data consistently: it applies the same permutation to the examples and to their labels, so each example stays paired with its label.
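To check that the pairing really survives the shuffle without downloading CIFAR-10, here is a minimal sketch on a tiny synthetic array. The helper name shuffle_and_split and its seed parameter are my own additions for illustration, not part of the session above; the seeded generator is only there to make the split reproducible.

```python
import numpy as np

def shuffle_and_split(X, Y, n_train, seed=0):
    """Shuffle X and Y with one shared permutation, then split at n_train.

    Hypothetical helper for illustration; `seed` makes the result reproducible.
    """
    if len(X) != len(Y):
        raise ValueError("X and Y must have the same number of samples")
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(Y))       # one permutation, reused for both arrays
    X_shuf, Y_shuf = X[perm], Y[perm]
    return (X_shuf[:n_train], Y_shuf[:n_train]), (X_shuf[n_train:], Y_shuf[n_train:])

# Synthetic stand-in for CIFAR-10: every pixel of sample i equals i, and its label is i
X = np.arange(12).reshape(12, 1, 1, 1) * np.ones((12, 2, 2, 3), dtype=np.int64)
Y = np.arange(12).reshape(12, 1)

(X_tr, Y_tr), (X_te, Y_te) = shuffle_and_split(X, Y, n_train=10)

# Each image still carries the value of its own label, so the pairs are intact
print(np.array_equal(X_tr[:, 0, 0, 0], Y_tr[:, 0]))
print(X_tr.shape, Y_tr.shape, X_te.shape, Y_te.shape)
```

With the real data, the same call would presumably be shuffle_and_split(X, Y, n_train=50000) on the stacked arrays from In [28].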