Why does scrapy core raise an exception when a URL request times out?
Source: 9-10 Scrapy data collection
慕婉清4097246
2019-12-06
Teacher, please see the error message below (the exception message is not shown in full, I'm not sure why; see reply #1 in the answer section below):
“2019-12-06 20:47:13 [scrapy.core.scraper] ERROR: Error downloading
My understanding is that when a URL request times out, scrapy core should simply retry it silently a few times according to RETRY_TIMES, so why does it raise an exception? Even if the retry limit is exceeded and no data is obtained, an exception traceback like the one shown still shouldn't appear. Timeouts happen fairly often, so how can they be treated as exceptional? At most, I would expect a log line like this, which seems normal to me: [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://xxxx.html>
Please explain, thank you!
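[Editor's note] For reference, the silent retrying referred to above is governed by Scrapy's RetryMiddleware settings; a minimal settings.py sketch follows (the values are illustrative, not taken from the original post):

# settings.py -- settings that govern the silent retries mentioned in the question
RETRY_ENABLED = True      # RetryMiddleware is enabled by default
RETRY_TIMES = 2           # extra attempts after the first failure (Scrapy's default is 2)
DOWNLOAD_TIMEOUT = 180    # seconds before a request is treated as timed out (default 180)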
1 answer
-
慕婉清4097246
Original poster
2019-12-06
Teacher, please see the following error message:
"2019-12-06 20:47:13 [scrapy.core.scraper] ERROR: Error downloading
<GET xxxxxxx.html>
Traceback (most recent call last):
File "E:\python-in-action\Envs\guahao_36\lib\site-packages\twisted\internet\defer.py", line 1416, in _inlineCallbacks
result = result.throwExceptionIntoGenerator(g)
File "E:\python-in-action\Envs\guahao_36\lib\site-packages\twisted\python\failure.py", line 512, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "E:\python-in-action\Envs\guahao_36\lib\site-packages\scrapy\core\downloader\middleware.py", line 44, in process_request
defer.returnValue((yield download_func(request=request, spider=spider)))
File "E:\python-in-action\Envs\guahao_36\lib\site-packages\twisted\internet\defer.py", line 654, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "E:\python-in-action\Envs\guahao_36\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 352, in _cb_timeout
raise TimeoutError("Getting %s took longer than %s seconds." % (url, timeout))"
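[Editor's note] For anyone hitting the same ERROR: once RetryMiddleware gives up, the failure propagates onward and is logged as an unhandled download error unless the request carries an errback. Below is a minimal sketch of handling the timeout in an errback; the spider name, URL, and handler name are placeholders, not taken from the original post.

import scrapy
from twisted.internet.error import TimeoutError


class ExampleSpider(scrapy.Spider):
    name = "example"  # placeholder spider name

    def start_requests(self):
        # placeholder URL; the errback receives the Failure once retries are exhausted
        yield scrapy.Request(
            "https://example.com/page.html",
            callback=self.parse,
            errback=self.handle_error,
        )

    def parse(self, response):
        self.logger.info("Downloaded %s", response.url)

    def handle_error(self, failure):
        # Handle timeouts quietly instead of letting them surface as ERROR tracebacks.
        if failure.check(TimeoutError):
            self.logger.debug("Gave up on %s after timeout", failure.request.url)
        else:
            self.logger.error(repr(failure))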