UnknownError (see above for traceback): Failed to get convolution algorithm.
来源:5-5 Train部分代码编写

N1ghtV0yager
2021-01-08
代码和老师的一样不知道为什么运行不了
/usr/bin/python3.6 /home/chlts/PycharmProjects/cifar10_demo/train.py
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_qint8 = np.dtype([(“qint8”, np.int8, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_quint8 = np.dtype([(“quint8”, np.uint8, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_qint16 = np.dtype([(“qint16”, np.int16, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_quint16 = np.dtype([(“quint16”, np.uint16, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
_np_qint32 = np.dtype([(“qint32”, np.int32, 1)])
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/dtypes.py:532: FutureWarning: Passing (type, 1) or ‘1type’ as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / ‘(1,)type’.
np_resource = np.dtype([(“resource”, np.ubyte, 1)])
WARNING:tensorflow:From /home/chlts/PycharmProjects/cifar10_demo/readcifar10.py:5: TFRecordReader.init (from tensorflow.python.ops.io_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data
. Use tf.data.TFRecordDataset
.
WARNING:tensorflow:From /home/chlts/PycharmProjects/cifar10_demo/readcifar10.py:11: string_input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data
. Use tf.data.Dataset.from_tensor_slices(string_tensor).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs)
. If shuffle=False
, omit the .shuffle(...)
.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/input.py:276: input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data
. Use tf.data.Dataset.from_tensor_slices(input_tensor).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs)
. If shuffle=False
, omit the .shuffle(...)
.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/input.py:188: limit_epochs (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data
. Use tf.data.Dataset.from_tensors(tensor).repeat(num_epochs)
.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/input.py:197: QueueRunner.init (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the tf.data
module.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/input.py:197: add_queue_runner (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the tf.data
module.
WARNING:tensorflow:From /home/chlts/PycharmProjects/cifar10_demo/readcifar10.py:16: shuffle_batch (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data
. Use tf.data.Dataset.shuffle(min_after_dequeue).batch(batch_size)
.
WARNING:tensorflow:From /home/chlts/PycharmProjects/cifar10_demo/train.py:70: softmax_cross_entropy (from tensorflow.contrib.losses.python.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.softmax_cross_entropy instead. Note that the order of the logits and labels arguments has been changed.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/losses/python/losses/loss_ops.py:398: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.
See tf.nn.softmax_cross_entropy_with_logits_v2
.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/losses/python/losses/loss_ops.py:399: compute_weighted_loss (from tensorflow.contrib.losses.python.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.compute_weighted_loss instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/losses/python/losses/loss_ops.py:147: add_arg_scope..func_with_args (from tensorflow.contrib.losses.python.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.add_loss instead.
WARNING:tensorflow:From /home/chlts/PycharmProjects/cifar10_demo/train.py:79: get_total_loss (from tensorflow.contrib.losses.python.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_total_loss instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/losses/python/losses/loss_ops.py:258: get_losses (from tensorflow.contrib.losses.python.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_losses instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/losses/python/losses/loss_ops.py:260: get_regularization_losses (from tensorflow.contrib.losses.python.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.get_regularization_losses instead.
2021-01-08 00:33:23.829903: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-01-08 00:33:23.915060: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-08 00:33:23.915730: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce 940MX major: 5 minor: 0 memoryClockRate(GHz): 1.2415
pciBusID: 0000:01:00.0
totalMemory: 1.96GiB freeMemory: 1.62GiB
2021-01-08 00:33:23.915743: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2021-01-08 00:33:24.637301: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-01-08 00:33:24.637328: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2021-01-08 00:33:24.637338: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2021-01-08 00:33:24.637524: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1372 MB memory) -> physical GPU (device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0, compute capability: 5.0)
WARNING:tensorflow:From /home/chlts/PycharmProjects/cifar10_demo/train.py:171: start_queue_runners (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the tf.data
module.
2021-01-08 00:33:28.895527: E tensorflow/stream_executor/cuda/cuda_dnn.cc:363] Loaded runtime CuDNN library: 7.0.5 but source was compiled with: 7.1.4. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2021-01-08 00:33:28.896904: E tensorflow/stream_executor/cuda/cuda_dnn.cc:363] Loaded runtime CuDNN library: 7.0.5 but source was compiled with: 7.1.4. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2021-01-08 00:33:28.900221: W tensorflow/core/kernels/queue_base.cc:277] _3_input_producer_1: Skipping cancelled enqueue attempt with queue not closed
2021-01-08 00:33:28.900260: W tensorflow/core/kernels/queue_base.cc:277] _0_input_producer: Skipping cancelled enqueue attempt with queue not closed
2021-01-08 00:33:28.900283: W tensorflow/core/kernels/queue_base.cc:277] _5_shuffle_batch_1/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2021-01-08 00:33:28.900308: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
Traceback (most recent call last):
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py”, line 1334, in _do_call
return fn(*args)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py”, line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py”, line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node Conv/Conv2D}} = Conv2D[T=DT_FLOAT, data_format=“NCHW”, dilations=[1, 1, 1, 1], padding=“SAME”, strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](gradients/Conv/Conv2D_grad/Conv2DBackpropFilter-0-TransposeNHWCToNCHW-LayoutOptimizer, Conv/weights/read)]]
[[{{node Mean_1/_77}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name=“edge_6804_Mean_1”, tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File “/home/chlts/PycharmProjects/cifar10_demo/train.py”, line 252, in
train()
File “/home/chlts/PycharmProjects/cifar10_demo/train.py”, line 207, in train
feed_dict=feed_dict)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py”, line 929, in run
run_metadata_ptr)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py”, line 1152, in _run
feed_dict_tensor, options, run_metadata)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py”, line 1328, in _do_run
run_metadata)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py”, line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node Conv/Conv2D (defined at /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/layers/python/layers/layers.py:1057) = Conv2D[T=DT_FLOAT, data_format=“NCHW”, dilations=[1, 1, 1, 1], padding=“SAME”, strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](gradients/Conv/Conv2D_grad/Conv2DBackpropFilter-0-TransposeNHWCToNCHW-LayoutOptimizer, Conv/weights/read)]]
[[{{node Mean_1/_77}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name=“edge_6804_Mean_1”, tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
Caused by op ‘Conv/Conv2D’, defined at:
File “/home/chlts/PycharmProjects/cifar10_demo/train.py”, line 252, in
train()
File “/home/chlts/PycharmProjects/cifar10_demo/train.py”, line 136, in train
logits=resnet.model_resnet(input_data,keep_prob=keep_prob,is_training=is_training)
File “/home/chlts/PycharmProjects/cifar10_demo/resnet.py”, line 46, in model_resnet
net = slim.conv2d(net, 64, [3, 3], activation_fn=tf.nn.relu)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py”, line 182, in func_with_args
return func(*args, **current_args)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/layers/python/layers/layers.py”, line 1154, in convolution2d
conv_dims=2)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py”, line 182, in func_with_args
return func(*args, **current_args)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/layers/python/layers/layers.py”, line 1057, in convolution
outputs = layer.apply(inputs)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py”, line 817, in apply
return self.call(inputs, *args, **kwargs)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/layers/base.py”, line 374, in call
outputs = super(Layer, self).call(inputs, *args, **kwargs)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py”, line 757, in call
outputs = self.call(inputs, *args, **kwargs)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/convolutional.py”, line 194, in call
outputs = self._convolution_op(inputs, self.kernel)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py”, line 868, in call
return self.conv_op(inp, filter)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py”, line 520, in call
return self.call(inp, filter)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py”, line 204, in call
name=self.name)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_nn_ops.py”, line 957, in conv2d
data_format=data_format, dilations=dilations, name=name)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py”, line 787, in _apply_op_helper
op_def=op_def)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py”, line 488, in new_func
return func(*args, **kwargs)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py”, line 3274, in create_op
op_def=op_def)
File “/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py”, line 1770, in init
self._traceback = tf_stack.extract_stack()
UnknownError (see above for traceback): Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node Conv/Conv2D (defined at /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/layers/python/layers/layers.py:1057) = Conv2D[T=DT_FLOAT, data_format=“NCHW”, dilations=[1, 1, 1, 1], padding=“SAME”, strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](gradients/Conv/Conv2D_grad/Conv2DBackpropFilter-0-TransposeNHWCToNCHW-LayoutOptimizer, Conv/weights/read)]]
[[{{node Mean_1/_77}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name=“edge_6804_Mean_1”, tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
1回答
-
E tensorflow/stream_executor/cuda/cuda_dnn.cc:363] Loaded runtime CuDNN library: 7.0.5 but source was compiled with: 7.1.4.
CUDA和cudnn安装错了。可以先尝试吧cudnn改成7.1.4
另外,
你的显卡是name: GeForce 940MX major: 5 minor: 0 memoryClockRate(GHz): 1.2415
安装的时候需要注意下,cuda的版本可能不能太高。具体支持的情况可以查一查。
可以先试一下,cuda10是否可以,tf用1.14
012021-01-08
相似问题