Why apply dropout only after the 20 fully connected layers, instead of one dropout after each fully connected layer?
Source: 2-10 Hands-on batch normalization, activation functions, and dropout
慕粉2125289011
2022-06-26
Why is dropout applied only once, after the 20 fully connected layers, rather than after every fully connected layer, the way BatchNormalization is?
Layer (type) Output Shape Param #
flatten (Flatten) (None, 784) 0
dense (Dense) (None, 100) 78500
dense_1 (Dense) (None, 100) 10100
dense_2 (Dense) (None, 100) 10100
dense_3 (Dense) (None, 100) 10100
dense_4 (Dense) (None, 100) 10100
dense_5 (Dense) (None, 100) 10100
dense_6 (Dense) (None, 100) 10100
dense_7 (Dense) (None, 100) 10100
dense_8 (Dense) (None, 100) 10100
dense_9 (Dense) (None, 100) 10100
dense_10 (Dense) (None, 100) 10100
dense_11 (Dense) (None, 100) 10100
dense_12 (Dense) (None, 100) 10100
dense_13 (Dense) (None, 100) 10100
dense_14 (Dense) (None, 100) 10100
dense_15 (Dense) (None, 100) 10100
dense_16 (Dense) (None, 100) 10100
dense_17 (Dense) (None, 100) 10100
dense_18 (Dense) (None, 100) 10100
dense_19 (Dense) (None, 100) 10100
alpha_dropout (AlphaDropout) (None, 100) 0
dense_20 (Dense) (None, 10) 1010
=================================================================
Total params: 271,410
Trainable params: 271,410
Non-trainable params: 0
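For context, the summary above presumably comes from a Sequential model built roughly like the sketch below. This is a hedged reconstruction, not the course's exact code; the selu activation, the 28x28 input shape, and the dropout rate of 0.5 are assumptions, but the layer sizes reproduce the parameter counts shown (78,500 + 19 x 10,100 + 1,010 = 271,410).

import tensorflow as tf
from tensorflow import keras

# Sketch of a model matching the summary above (assumed, not the course's
# exact code): 20 Dense layers, then a single AlphaDropout right before
# the softmax output layer.
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28, 28]))
for _ in range(20):
    model.add(keras.layers.Dense(100, activation="selu"))
model.add(keras.layers.AlphaDropout(rate=0.5))  # rate is an assumption
model.add(keras.layers.Dense(10, activation="softmax"))
model.summary()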
1 Answer
You can attach dropout to every layer; the code here just shows one way of using it. Generally speaking, if there are many dropout layers, the model can become hard to train, so if you do add dropout after every layer, don't set the dropout rate too high.
You can try adding it after every layer and see how the results compare; a sketch of that variant follows below.
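The per-layer variant might look like the sketch below. The rate of 0.1 is only an illustrative assumption, chosen small because stacking twenty dropout layers with a large rate can make the network hard to train, as noted above.

# Variant with AlphaDropout after every Dense layer (sketch; the small
# rate of 0.1 is an illustrative assumption).
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28, 28]))
for _ in range(20):
    model.add(keras.layers.Dense(100, activation="selu"))
    model.add(keras.layers.AlphaDropout(rate=0.1))
model.add(keras.layers.Dense(10, activation="softmax"))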
2022-07-10