I want to build a machine learning model for audio files. I converted the audio files to a tensor (spectrograms). My feature tensor (the audio files) has the shape [119, 241, 125] (119 files, 241 samples per file, 125 frequencies per sample). By "sample" I mean the slices I took per time interval, e.g. 16 ms. My output shape will be [119, numOptions].

I followed this Tensorflow.js tutorial on audio recognition. They build this model:
![Model](https://i.stack.imgur.com/kLsC4.png)
I reshape my feature tensor to 4D for the conv2d layers:
this.features = this.features.reshape([this.features.shape[0], this.features.shape[1], this.features.shape[2], 1]);
buildModel() {
  const inputShape1 = [this.features.shape[1], this.features.shape[2], this.features.shape[3]];
  this.model = tfNode.sequential();
  // filter to the image => feature extractor, edge detector, sharpener (depends on the models understanding)
  this.model.add(tfNode.layers.conv2d(
    {filters: 8, kernelSize: [4, 2], activation: 'relu', inputShape: inputShape1}
  ));
  // see the image at a higher level, generalize it more, prevent overfit
  this.model.add(tfNode.layers.maxPooling2d(
    {poolSize: [2, 2], strides: [2, 2]}
  ));
  // filter to the image => feature extractor, edge detector, sharpener (depends on the models understanding)
  const inputShape2 = [119, 62, 8];
  this.model.add(tfNode.layers.conv2d(
    {filters: 32, kernelSize: [4, 2], activation: 'relu', inputShape: inputShape2}
  ));
  // see the image at a higher level, generalize it more, prevent overfit
  this.model.add(tfNode.layers.maxPooling2d(
    {poolSize: [2, 2], strides: [2, 2]}
  ));
  // filter to the image => feature extractor, edge detector, sharpener (depends on the models understanding)
  const inputShape3 = [58, 30, 32];
  this.model.add(tfNode.layers.conv2d(
    {filters: 32, kernelSize: [4, 2], activation: 'relu', inputShape: inputShape3}
  ));
  // see the image at a higher level, generalize it more, prevent overfit
  this.model.add(tfNode.layers.maxPooling2d(
    {poolSize: [2, 2], strides: [2, 2]}
  ));
  // 1D output, => final output score of labels
  this.model.add(tfNode.layers.flatten({}));
  // prevents overfitting, randomly set 0
  this.model.add(tfNode.layers.dropout({rate: 0.25}));
  // learn anything linear, non linear comb. from conv. and soft pool
  this.model.add(tfNode.layers.dense({units: 2000, activation: 'relu'}));
  this.model.add(tfNode.layers.dropout({rate: 0.25}));
  // give probability for each label
  this.model.add(tfNode.layers.dense({units: this.labels.shape[1], activation: 'softmax'}));
  this.model.summary();
  // compile the model
  this.model.compile({loss: 'meanSquaredError', optimizer: 'adam'});
  this.model.summary()
};
Model summary:
_________________________________________________________________
Layer (type)                 Output shape              Param #
=================================================================
conv2d_Conv2D1 (Conv2D)      [null,238,124,8]          72
_________________________________________________________________
max_pooling2d_MaxPooling2D1  [null,119,62,8]           0
_________________________________________________________________
conv2d_Conv2D2 (Conv2D)      [null,116,61,32]          2080
_________________________________________________________________
max_pooling2d_MaxPooling2D2  [null,58,30,32]           0
_________________________________________________________________
conv2d_Conv2D3 (Conv2D)      [null,55,29,32]           8224
_________________________________________________________________
max_pooling2d_MaxPooling2D3  [null,27,14,32]           0
_________________________________________________________________
flatten_Flatten1 (Flatten)   [null,12096]              0
_________________________________________________________________
dropout_Dropout1 (Dropout)   [null,12096]              0
_________________________________________________________________
dense_Dense1 (Dense)         [null,2000]               24194000
_________________________________________________________________
dropout_Dropout2 (Dropout)   [null,2000]               0
_________________________________________________________________
dense_Dense2 (Dense)         [null,2]                  4002
=================================================================
Total params: 24208378
Trainable params: 24208378
Non-trainable params: 0
_________________________________________________________________
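
As a sanity check on the summary above, the parameter counts can be reproduced by hand. This is only a sketch in plain JavaScript (no TensorFlow.js calls); the helper names `conv2dParams` and `denseParams` are made up for illustration, and it assumes every layer has a bias term, which is the default.

```javascript
// Hypothetical helpers, not part of TensorFlow.js.
// conv2d: one kernel of size kh x kw x inChannels per filter, plus one bias per filter.
function conv2dParams(kh, kw, inChannels, filters) {
  return kh * kw * inChannels * filters + filters;
}

// dense: one weight per (input, unit) pair, plus one bias per unit.
function denseParams(inputs, units) {
  return inputs * units + units;
}

const counts = [
  conv2dParams(4, 2, 1, 8),    // conv2d_Conv2D1
  conv2dParams(4, 2, 8, 32),   // conv2d_Conv2D2
  conv2dParams(4, 2, 32, 32),  // conv2d_Conv2D3
  denseParams(12096, 2000),    // dense_Dense1 (12096 = 27 * 14 * 32 after flatten)
  denseParams(2000, 2),        // dense_Dense2
];
const total = counts.reduce((sum, n) => sum + n, 0);

console.log(counts, total);
```

Almost all of the parameters sit in the first dense layer, because it is fed the 12096 flattened values directly.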
Epoch 1 / 10
eta=0.0 ======================================>----------------------------------------------------------------------------- loss=0.515 0.51476
eta=0.8 ============================================================================>--------------------------------------- loss=0.442 0.44186
eta=0.0 ===================================================================================================================>
3449ms 32236us/step - loss=0.485 val_loss=0.958
Epoch 2 / 10
eta=0.0 ======================================>----------------------------------------------------------------------------- loss=0.422 0.42188
eta=0.9 ============================================================================>--------------------------------------- loss=0.395 0.39535
eta=0.0 ===================================================================================================================>
3643ms 34043us/step - loss=0.411 val_loss=0.958
Epoch 3 / 10
1) The first input shape is the shape of my feature tensor. The other two inputShape values (inputShape2, inputShape3) were taken from the error message I got. How can I determine the next two input shapes in advance?
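
For reference, the shapes in the model summary can be derived by hand. The sketch below is plain JavaScript with hypothetical helper names (`validOutputSize`, `conv2dOutputShape`, `maxPool2dOutputShape`); it assumes 'valid' padding (no zero padding) and a stride of 1 for the conv2d layers, which is what the summary above is consistent with. Under that assumption each axis shrinks to floor((size - window) / stride) + 1.

```javascript
// Hypothetical helpers, not part of TensorFlow.js.
// Output size along one axis with 'valid' padding.
function validOutputSize(size, window, stride) {
  return Math.floor((size - window) / stride) + 1;
}

// [height, width, channels] after conv2d: channels become the number of filters.
function conv2dOutputShape([h, w], filters, [kh, kw], [sh, sw] = [1, 1]) {
  return [validOutputSize(h, kh, sh), validOutputSize(w, kw, sw), filters];
}

// [height, width, channels] after maxPooling2d: channels are unchanged.
function maxPool2dOutputShape([h, w, c], [ph, pw], [sh, sw]) {
  return [validOutputSize(h, ph, sh), validOutputSize(w, pw, sw), c];
}

// Start from one reshaped example: 241 samples x 125 frequencies x 1 channel.
let shape = [241, 125, 1];
const shapes = [];
for (const filters of [8, 32, 32]) {
  shape = conv2dOutputShape(shape, filters, [4, 2]);
  shapes.push(shape);
  shape = maxPool2dOutputShape(shape, [2, 2], [2, 2]);
  shapes.push(shape);
}

console.log(shapes);
// The 2nd and 4th entries correspond to inputShape2 and inputShape3.
```

In a sequential model only the first layer needs an inputShape; later layers infer theirs from the previous layer's output, so this calculation is mainly useful for checking the numbers.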