302 redirect problem

Source: 7-4 Using Rule and LinkExtractor

晴天浪浪

2018-10-08

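I have already logged in successfully by simulating the login with Selenium, and I have also limited the crawl speed (DOWNLOAD_DELAY = 3 in settings), but the crawl still produces 302 redirects like these: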
2018-10-08 11:01:10 [scrapy.core.engine] INFO: Spider opened
2018-10-08 11:01:10 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-10-08 11:01:10 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-10-08 11:01:12 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lagou.com/> (referer: https://www.lagou.com)
2018-10-08 11:01:14 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/Java/>
2018-10-08 11:01:18 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/C++/>
2018-10-08 11:01:18 [scrapy.dupefilters] DEBUG: Filtered duplicate request: <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> - no more duplicates will be shown (see DUPEFILTER_DEBUG to show all duplicates)
2018-10-08 11:01:21 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/PHP/>
2018-10-08 11:01:24 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/Ruby/>
2018-10-08 11:01:29 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/Perl/>
2018-10-08 11:01:31 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/VB/>
2018-10-08 11:01:36 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/Delphi/>
2018-10-08 11:01:39 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/Python/>
2018-10-08 11:01:44 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/Hadoop/>
2018-10-08 11:01:48 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/.NET/>
2018-10-08 11:01:51 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/quanzhangongchengshi/>
2018-10-08 11:01:54 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/C%23/>
2018-10-08 11:01:58 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/C/>
2018-10-08 11:02:02 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/jingzhuntuijian/>
2018-10-08 11:02:05 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/sousuosuanfa/>
2018-10-08 11:02:10 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.192.170.161> from <GET https://www.lagou.com/zhaopin/CTO/>

Restarting the router did not help either. Teacher, what should I do?
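One thing that may be worth checking, as a guess rather than a confirmed cause, is whether the cookies from the Selenium login are actually being sent with Scrapy's own requests. Below is a minimal sketch of the conversion step. `selenium_cookies_to_dict` is a hypothetical helper name, and the cookie names in the sample are made up for illustration; only the list-of-dicts shape with "name" and "value" keys matches what Selenium's `driver.get_cookies()` returns.

```python
def selenium_cookies_to_dict(selenium_cookies):
    """Convert the list returned by Selenium's driver.get_cookies()
    (dicts with at least "name" and "value" keys) into a plain dict."""
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


# Made-up sample in the shape that driver.get_cookies() returns.
sample = [
    {"name": "session_id", "value": "abc123", "domain": ".lagou.com"},
    {"name": "user_token", "value": "xyz789", "domain": ".lagou.com"},
]
cookie_dict = selenium_cookies_to_dict(sample)
print(cookie_dict)

# In a spider, the dict could then be passed along with a request:
#   yield scrapy.Request(url, cookies=cookie_dict, callback=self.parse)
```

Scrapy's `Request` accepts a `cookies` argument, and with cookies enabled (the default) its cookie middleware keeps sending them on later requests. Whether that alone gets past the validation redirect here is not something the log can confirm.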


1 answer

晴天浪浪

Asker

2018-10-08


After restarting the router the clientIp changed as well, but I still get 302:

2018-10-08 11:33:42 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.204.140.198> from <GET https://www.lagou.com/zhaopin/Java/>
2018-10-08 11:33:47 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.204.140.198> from <GET https://www.lagou.com/zhaopin/qukuailian/>
2018-10-08 11:33:47 [scrapy.dupefilters] DEBUG: Filtered duplicate request: <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.204.140.198> - no more duplicates will be shown (see DUPEFILTER_DEBUG to show all duplicates)
2018-10-08 11:33:51 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.204.140.198> from <GET https://www.lagou.com/zhaopin/C++/>
2018-10-08 11:33:54 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.204.140.198> from <GET https://www.lagou.com/zhaopin/PHP/>
2018-10-08 11:33:58 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.204.140.198> from <GET https://www.lagou.com/zhaopin/Perl/>
2018-10-08 11:34:03 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.204.140.198> from <GET https://www.lagou.com/zhaopin/VB/>
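Every listing page in the log is redirected to the same target carrying `msg=validation`, regardless of the client IP. A small sketch for recognizing that target, so a spider could stop or log in again instead of continuing to crawl, might look like this. `is_validation_redirect` is a hypothetical helper; the only detail taken from the log is the `msg=validation` query parameter, and treating it as the marker of a blocked request is an assumption.

```python
from urllib.parse import urlparse, parse_qs


def is_validation_redirect(url):
    """Return True if the URL carries the msg=validation query parameter
    seen in the redirect targets of the log above."""
    query = parse_qs(urlparse(url).query)
    return query.get("msg") == ["validation"]


blocked = "https://www.lagou.com?msg=validation&uStatus=3&clientIp=115.204.140.198"
normal = "https://www.lagou.com/zhaopin/Java/"
print(is_validation_redirect(blocked))  # True
print(is_validation_redirect(normal))   # False
```

In a Scrapy callback the check could be applied to `response.url`. Alternatively, a request can set `meta={"dont_redirect": True, "handle_httpstatus_list": [302]}` so the 302 response reaches the callback and its `Location` header can be inspected directly.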


bobby
replying to
慕娘6095299
I will be updating the course soon, and the update will address this Lagou problem.
2018-10-18
3 replies in total
