SSD model training gets killed

Source: 1-1 Course Introduction

关plus

2021-04-20

WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
W0420 10:01:32.517391 139836587771712 model_lib.py:812] Forced number of epochs for all eval validations to be 1.
INFO:tensorflow:Maybe overwriting train_steps: 100000
I0420 10:01:32.517512 139836587771712 config_util.py:552] Maybe overwriting train_steps: 100000
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0420 10:01:32.517569 139836587771712 config_util.py:552] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: 1
I0420 10:01:32.517614 139836587771712 config_util.py:552] Maybe overwriting sample_1_of_n_eval_examples: 1
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
I0420 10:01:32.517659 139836587771712 config_util.py:552] Maybe overwriting eval_num_epochs: 1
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered eval_on_train_input_config.num_epochs = 0. Overwriting num_epochs to 1.
W0420 10:01:32.517717 139836587771712 model_lib.py:828] Expected number of evaluation epochs is 1, but instead encountered eval_on_train_input_config.num_epochs = 0. Overwriting num_epochs to 1.
INFO:tensorflow:create_estimator_and_inputs: use_tpu False, export_to_tpu None
I0420 10:01:32.517769 139836587771712 model_lib.py:865] create_estimator_and_inputs: use_tpu False, export_to_tpu None
INFO:tensorflow:Using config: {'_model_dir': '/home/gjh/下载/widerface/resnet50v1-fpn', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f2d781358d0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
I0420 10:01:32.518028 139836587771712 estimator.py:212] Using config: {'_model_dir': '/home/gjh/下载/widerface/resnet50v1-fpn', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f2d781358d0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
WARNING:tensorflow:Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f2d781580d0>) includes params argument, but params are not passed to Estimator.
W0420 10:01:32.518559 139836587771712 model_fn.py:630] Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f2d781580d0>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Not using Distribute Coordinator.
I0420 10:01:32.519039 139836587771712 estimator_training.py:186] Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
I0420 10:01:32.519214 139836587771712 training.py:612] Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
I0420 10:01:32.519395 139836587771712 training.py:700] Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
WARNING:tensorflow:From /home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0420 10:01:32.522826 139836587771712 deprecation.py:323] From /home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
INFO:tensorflow:Reading unweighted datasets: ['/home/gjh/下载/widerface/TF-data/train.record']
I0420 10:01:32.545341 139836587771712 dataset_builder.py:163] Reading unweighted datasets: ['/home/gjh/下载/widerface/TF-data/train.record']
INFO:tensorflow:Reading record datasets for input file: ['/home/gjh/下载/widerface/TF-data/train.record']
I0420 10:01:32.545800 139836587771712 dataset_builder.py:80] Reading record datasets for input file: ['/home/gjh/下载/widerface/TF-data/train.record']
INFO:tensorflow:Number of filenames to read: 1
I0420 10:01:32.545873 139836587771712 dataset_builder.py:81] Number of filenames to read: 1
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W0420 10:01:32.546003 139836587771712 dataset_builder.py:88] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /home/gjh/.local/lib/python3.6/site-packages/object_detection/builders/dataset_builder.py:105: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE) instead. If sloppy execution is desired, use tf.data.Options.experimental_determinstic.
W0420 10:01:32.549871 139836587771712 deprecation.py:323] From /home/gjh/.local/lib/python3.6/site-packages/object_detection/builders/dataset_builder.py:105: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE) instead. If sloppy execution is desired, use tf.data.Options.experimental_determinstic.
WARNING:tensorflow:From /home/gjh/.local/lib/python3.6/site-packages/object_detection/builders/dataset_builder.py:237: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.map()
W0420 10:01:32.568378 139836587771712 deprecation.py:323] From /home/gjh/.local/lib/python3.6/site-packages/object_detection/builders/dataset_builder.py:237: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.map()
WARNING:tensorflow:From /home/gjh/.local/lib/python3.6/site-packages/object_detection/inputs.py:110: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0420 10:01:40.669782 139836587771712 deprecation.py:323] From /home/gjh/.local/lib/python3.6/site-packages/object_detection/inputs.py:110: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /home/gjh/.local/lib/python3.6/site-packages/object_detection/inputs.py:94: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
W0420 10:01:40.796485 139836587771712 deprecation.py:323] From /home/gjh/.local/lib/python3.6/site-packages/object_detection/inputs.py:94: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
WARNING:tensorflow:From /home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:1004: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
seed2 arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
W0420 10:01:45.189702 139836587771712 api.py:332] From /home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:1004: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
seed2 arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
WARNING:tensorflow:From /home/gjh/.local/lib/python3.6/site-packages/object_detection/inputs.py:282: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
W0420 10:01:47.216368 139836587771712 deprecation.py:323] From /home/gjh/.local/lib/python3.6/site-packages/object_detection/inputs.py:282: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
INFO:tensorflow:Calling model_fn.
I0420 10:01:49.894197 139836587771712 estimator.py:1148] Calling model_fn.
WARNING:tensorflow:From /home/gjh/.local/lib/python3.6/site-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use layer.__call__ method instead.
W0420 10:01:49.902148 139836587771712 deprecation.py:323] From /home/gjh/.local/lib/python3.6/site-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use layer.__call__ method instead.
INFO:tensorflow:Done calling model_fn.
I0420 10:01:55.426304 139836587771712 estimator.py:1150] Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
I0420 10:01:55.427150 139836587771712 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
I0420 10:01:56.973568 139836587771712 monitored_session.py:240] Graph was finalized.
2021-04-20 10:01:56.973781: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-04-20 10:01:56.996040: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2299965000 Hz
2021-04-20 10:01:56.996359: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5ce76a0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-04-20 10:01:56.996387: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-04-20 10:01:56.997875: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-04-20 10:01:57.039474: W tensorflow/compiler/xla/service/platform_util.cc:256] unable to create StreamExecutor for CUDA:0: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_INVALID_VALUE: invalid argument
2021-04-20 10:01:57.039633: F tensorflow/stream_executor/lib/statusor.cc:34] Attempting to fetch value instead of handling error Internal: no supported devices found for platform CUDA
Fatal Python error: Aborted

Thread 0x00007f2d7709e700 (most recent call first):
  File "/usr/lib/python3.6/threading.py", line 295 in wait
  File "/usr/lib/python3.6/queue.py", line 164 in get
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/summary/writer/event_file_writer.py", line 159 in run
  File "/usr/lib/python3.6/threading.py", line 916 in _bootstrap_inner
  File "/usr/lib/python3.6/threading.py", line 884 in _bootstrap

Current thread 0x00007f2e3e240740 (most recent call first):
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 699 in __init__
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1585 in __init__
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/training/session_manager.py", line 194 in _restore_checkpoint
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/training/session_manager.py", line 290 in prepare_session
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/training/monitored_session.py", line 647 in create_session
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/training/monitored_session.py", line 878 in create_session
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/training/monitored_session.py", line 1212 in _create_session
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/training/monitored_session.py", line 1207 in __init__
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/training/monitored_session.py", line 725 in __init__
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/training/monitored_session.py", line 1014 in __init__
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/training/monitored_session.py", line 584 in MonitoredTrainingSession
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1490 in _train_with_estimator_spec
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1195 in _train_model_default
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1161 in _train_model
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 370 in train
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/training.py", line 714 in run_local
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/training.py", line 613 in run
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/training.py", line 473 in train_and_evaluate
  File "object_detection/model_main.py", line 104 in main
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 251 in _run_main
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 303 in run
  File "/home/gjh/.local/lib/python3.6/site-packages/tensorflow_core/python/platform/app.py", line 40 in run
  File "object_detection/model_main.py", line 108 in <module>
Aborted (core dumped)


1 answer

关plus

Original poster

2021-04-21

Problem solved: the GPU driver version was too new.
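
The thread does not say which driver version was installed or which one fixed the crash. For anyone hitting the same `cuDevicePrimaryCtxRetain` failure, a minimal diagnostic sketch is to read the installed driver version before changing anything, so it can be compared against what the installed TensorFlow/CUDA build expects. This assumes `nvidia-smi` is on the PATH; the helper names `parse_driver_rows` and `query_driver` are hypothetical and not part of the course project.

```python
import shutil
import subprocess


def parse_driver_rows(csv_text):
    """Parse 'driver_version, name' CSV rows into (version_tuple, name) pairs."""
    rows = []
    for line in csv_text.strip().splitlines():
        # Split on the first comma only: version on the left, GPU name on the right.
        version, _, name = line.partition(",")
        parts = tuple(int(p) for p in version.strip().split("."))
        rows.append((parts, name.strip()))
    return rows


def query_driver():
    """Return parsed nvidia-smi rows, or None when nvidia-smi is unavailable or fails."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version,name", "--format=csv,noheader"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # text mode; works on the Python 3.6 seen in the log
    )
    if result.returncode != 0:
        return None
    return parse_driver_rows(result.stdout)


if __name__ == "__main__":
    # Prints one (version, name) pair per GPU, or None if the driver cannot be queried.
    print(query_driver())
```

The version tuple makes it easy to compare against a minimum or maximum driver version taken from the compatibility tables for the CUDA release in use; which bound applies here is not stated in the thread.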

谦瑞
How can I run this project on a 30-series graphics card? Any help would be appreciated.
2022-12-13
1 reply in total

Course: Python3+TensorFlow打造人脸识别智能小程序 (Building a face-recognition smart mini program with Python 3 and TensorFlow)
