patience is 5, but I count only 2 such epochs, so why did training stop? Still unresolved

Source: 2-7 Hands-on regression model

慕神4535282

2021-04-28

Good morning, teacher. I have a question.
In this section I typed the code along with the video.

from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.Dense(30, activation='relu',
                       input_shape=x_train.shape[1:]), # x_train.shape[1:], i.e. x_train.shape[1:8]
    keras.layers.Dense(1)
])
model.summary()
model.compile(loss="mean_squared_error", optimizer="sgd")
callbacks = [keras.callbacks.EarlyStopping(patience=5, min_delta=1e-3)]

Here is the output of fit():

Epoch 1/100
363/363 [==============================] - 1s 2ms/step - loss: 3.5061 - val_loss: 0.4793
Epoch 2/100
363/363 [==============================] - 0s 877us/step - loss: 0.4930 - val_loss: 0.4187
Epoch 3/100
363/363 [==============================] - 0s 890us/step - loss: 0.3834 - val_loss: 0.4354
Epoch 4/100
363/363 [==============================] - 0s 863us/step - loss: 0.4189 - val_loss: 0.3943
Epoch 5/100
363/363 [==============================] - 0s 905us/step - loss: 0.3898 - val_loss: 0.3839
Epoch 6/100
363/363 [==============================] - 0s 872us/step - loss: 0.3588 - val_loss: 0.3770
Epoch 7/100
363/363 [==============================] - 0s 891us/step - loss: 0.3778 - val_loss: 0.4224
Epoch 8/100
363/363 [==============================] - 0s 872us/step - loss: 0.3715 - val_loss: 0.3769
Epoch 9/100
363/363 [==============================] - 0s 887us/step - loss: 0.3555 - val_loss: 0.3731
Epoch 10/100
363/363 [==============================] - 0s 921us/step - loss: 0.3442 - val_loss: 0.3687
Epoch 11/100
363/363 [==============================] - 0s 875us/step - loss: 0.3628 - val_loss: 0.3761
Epoch 12/100
363/363 [==============================] - 0s 883us/step - loss: 0.3781 - val_loss: 0.3660
Epoch 13/100
363/363 [==============================] - 0s 890us/step - loss: 0.3531 - val_loss: 0.3649
Epoch 14/100
363/363 [==============================] - 0s 945us/step - loss: 0.3629 - val_loss: 0.3672
Epoch 15/100
363/363 [==============================] - 0s 914us/step - loss: 0.3557 - val_loss: 0.3653
Epoch 16/100
363/363 [==============================] - 0s 1ms/step - loss: 0.3558 - val_loss: 0.3618
Epoch 17/100
363/363 [==============================] - 0s 905us/step - loss: 0.3498 - val_loss: 0.3611
Epoch 18/100
363/363 [==============================] - 0s 912us/step - loss: 0.3538 - val_loss: 0.3593
Epoch 19/100
363/363 [==============================] - 0s 886us/step - loss: 0.3612 - val_loss: 0.3971
Epoch 20/100
363/363 [==============================] - 0s 909us/step - loss: 0.3915 - val_loss: 0.3673
Epoch 21/100
363/363 [==============================] - 0s 901us/step - loss: 0.3549 - val_loss: 0.3646
Epoch 22/100
363/363 [==============================] - 0s 864us/step - loss: 0.3437 - val_loss: 0.3587
Epoch 23/100
363/363 [==============================] - 0s 900us/step - loss: 0.3506 - val_loss: 0.3593

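I went over this repeatedly. Between epochs 16/100 and 17/100 the val_loss difference is 0.3618 - 0.3611 = 0.0007 < 1e-3, and the other case is
epochs 22/100 and 23/100, where the val_loss difference is 0.3587 - 0.3593 = -0.0006 < 1e-3.
I have checked this carefully many times.
My questions for the teacher:
Question 1: Which five epochs exactly? (Negative differences presumably do not count; I counted, and if they did there would be more than five.)
Question 2: Should the -0.0006 computed above be taken as an absolute value?

Thank you for your guidance!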

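Note: your answer said that negative differences should be taken into account. As I mentioned above, I do not think they should count, because I counted, and if they did there would be more than five.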


1 answer

正十七

2021-04-29

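Hello. I took another careful look at the source, and the logic is as follows. First, by default the callback monitors val_loss. Second, each epoch is compared not with the previous epoch but with the best value seen so far. For example, starting from 0.3593, three epochs fail to improve on it.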
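
To make the difference between the two comparisons concrete, here is a small sketch using the val_loss values of epochs 16 to 23 copied from the log in the question. It deliberately ignores min_delta and starts the running best at epoch 16, purely for illustration; the variable names are not Keras attributes.

```python
from itertools import accumulate

# val_loss for epochs 16..23, copied from the log in the question
val_loss = [0.3618, 0.3611, 0.3593, 0.3971, 0.3673, 0.3646, 0.3587, 0.3593]
first_epoch = 16

# Lowest value seen up to and including each epoch
running_best = list(accumulate(val_loss, min))

beats_previous = []  # lower than the epoch just before
beats_best = []      # lower than the best of all earlier epochs in this window
for i in range(1, len(val_loss)):
    beats_previous.append(val_loss[i] < val_loss[i - 1])
    beats_best.append(val_loss[i] < running_best[i - 1])
    print(first_epoch + i, beats_previous[-1], beats_best[-1])
```

Epochs 20 and 21 beat the previous epoch but not the best so far, which is why counting consecutive differences gives a different tally from the one the callback keeps.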

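But we need five, and the other two come from min_delta. The last two values (0.3587 and 0.3593) are not higher than 0.3593, yet neither beats it by more than 1e-3, so the callback treats them as non-improving as well. That makes epochs 19 to 23 the five. The code is linked here if you want to study it further: https://github.com/tensorflow/tensorflow/blob/85c8b2a817f95a3e979ecd1ed95bff1dc1335cff/tensorflow/python/keras/callbacks.py#L1784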

Next, negative differences really do count. The callback compares the value of (current - min_delta) with best, and no absolute value is taken.
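
The signed comparison can be illustrated with the epoch 22 numbers. This is a minimal sketch, assuming the default min mode for val_loss, in which the Keras constructor negates min_delta (so the min_delta=1e-3 passed by the user is stored as -1e-3). The variable names are illustrative only.

```python
# Stored value in min mode (the user passed 1e-3); see the assumption above
min_delta = -1e-3
best = 0.3593      # best val_loss so far (epoch 18 in the log)
current = 0.3587   # val_loss at epoch 22

# Signed difference: negative means lower than best, here by only about 0.0006
signed_diff = current - best

# Same shape as the check in on_epoch_end, with "<" standing in for np.less
improved = (current - min_delta) < best

print(f"{signed_diff:.4f}", improved)
```

The difference is negative, yet the epoch does not count as an improvement, because the drop is smaller than 1e-3. No absolute value appears anywhere in the check.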

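Finally, I would not recommend digging into this detail. If you had not asked, I would never have read this part of the source closely (and thank you for spotting it). Focus on the big picture; as long as it does not affect usage, that is enough. There are plenty of bugs in tf.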

def on_epoch_end(self, epoch, logs=None):
    current = self.get_monitor_value(logs)
    if current is None:
        return
    # For val_loss, monitor_op is np.less and min_delta was negated in __init__,
    # so this tests: current + |min_delta| < best (the best value so far).
    if self.monitor_op(current - self.min_delta, self.best):
        self.best = current
        self.wait = 0  # a real improvement resets the counter
        if self.restore_best_weights:
            self.best_weights = self.model.get_weights()
    else:
        self.wait += 1  # no improvement of at least |min_delta|
        if self.wait >= self.patience:
            self.stopped_epoch = epoch
            self.model.stop_training = True
            if self.restore_best_weights:
                if self.verbose > 0:
                    print('Restoring model weights from the end of the best epoch.')
                self.model.set_weights(self.best_weights)
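
One way to check the count is to replay the logged val_loss values through the same logic. The sketch below is not Keras code: `replay` is a hypothetical helper, it assumes min mode (where the stored min_delta is negated, so the test becomes current + 1e-3 < best), and it uses the rounded values printed in the log, so the true values may differ in the last digit.

```python
import math

# val_loss for epochs 1..23, copied from the log in the question
val_loss = [
    0.4793, 0.4187, 0.4354, 0.3943, 0.3839, 0.3770, 0.4224, 0.3769,
    0.3731, 0.3687, 0.3761, 0.3660, 0.3649, 0.3672, 0.3653, 0.3618,
    0.3611, 0.3593, 0.3971, 0.3673, 0.3646, 0.3587, 0.3593,
]

def replay(losses, patience=5, min_delta=1e-3):
    """Replay the wait counter of the on_epoch_end logic for a 'min' metric."""
    best, wait = math.inf, 0
    trace = []
    for epoch, current in enumerate(losses, start=1):
        if current + min_delta < best:
            # Improved on the best so far by more than min_delta
            best, wait = current, 0
        else:
            wait += 1
        trace.append((epoch, current, best, wait))
        if wait >= patience:
            break  # this is where training would stop
    return trace

trace = replay(val_loss)
for epoch, current, best, wait in trace:
    print(f"epoch {epoch:2d}  val_loss {current:.4f}  best {best:.4f}  wait {wait}")
```

On these rounded values the counter is last reset at epoch 18 (0.3593) and then climbs from 1 to 5 over epochs 19 to 23, which matches the epoch at which the run in the question stopped.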


慕神4535282
Thank you very much!
2021-04-30

Course: Taught by a Google instructor: TensorFlow 2.0 from beginner to advanced
