CrawlSpider on Lagou keeps redirecting to the login page

Source: 7-4 Using Rule and LinkExtractor

慕田峪5356548

2017-06-11

2017-06-11 14:10:42 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://passport.lagou.com/login/login.html?msg=validation&uStatus=2&clientIp=101.94.167.147> from <GET https://www.lagou.com/jobs/1736086.html>

2017-06-11 14:10:43 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://passport.lagou.com/login/login.html?msg=validation&uStatus=2&clientIp=101.94.167.147> from <GET https://www.lagou.com/jobs/3072260.html>

2017-06-11 14:10:43 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://passport.lagou.com/login/login.html?msg=validation&uStatus=2&clientIp=101.94.167.147> from <GET https://www.lagou.com/jobs/3134658.html>

2017-06-11 14:10:43 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://passport.lagou.com/login/login.html?msg=validation&uStatus=2&clientIp=101.94.167.147> from <GET https://www.lagou.com/jobs/2939897.html>

2017-06-11 14:10:43 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://passport.lagou.com/login/login.html?msg=validation&uStatus=2&clientIp=101.94.167.147> from <GET https://www.lagou.com/jobs/3196132.html>

2017-06-11 14:10:43 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://passport.lagou.com/login/login.html?msg=validation&uStatus=2&clientIp=101.94.167.147> from <GET https://www.lagou.com/jobs/1018006.html>

2017-06-11 14:10:43 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://passport.lagou.com/login/login.html?msg=validation&uStatus=2&clientIp=101.94.167.147> from <GET https://www.lagou.com/zhaopin/iOS/2/>
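Every 302 in the log above sends a job-page request to passport.lagou.com, Lagou's login wall. Before tuning speed, it can help to detect this condition in code so the spider stops wasting requests. A minimal sketch (the host string is taken from the log above; the helper name is my own):

```python
from urllib.parse import urlparse

LOGIN_HOST = "passport.lagou.com"  # host every redirect in the log points at

def hit_login_wall(url: str) -> bool:
    """Return True when a (post-redirect) URL lands on Lagou's login page."""
    return urlparse(url).netloc == LOGIN_HOST

# The redirect target from the log is caught; a normal job page is not.
assert hit_login_wall("https://passport.lagou.com/login/login.html?msg=validation&uStatus=2")
assert not hit_login_wall("https://www.lagou.com/jobs/1736086.html")
```

A check like this could be called from a spider callback or a downloader middleware to log or drop requests that hit the login wall.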



4 answers

慕斯卡5268176

2017-07-21

I ran into the same problem.

bobby
That's because Lagou has flagged you as a crawler. Don't crawl too fast; throttle your speed.
2017-07-24

慕田峪5356548

(original poster)

2017-07-10

After I set the download delay to 10 seconds the problem went away. Looks like I'll need proxy IPs or similar techniques.
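For reference, the poster's fix corresponds to a couple of lines in settings.py; the AutoThrottle options are an optional extra that lets Scrapy adapt the delay automatically (values here are illustrative, not recommendations):

```python
# settings.py -- throttling sketch; 10 s is the value the poster reported working
DOWNLOAD_DELAY = 10                  # wait between requests to the same site
RANDOMIZE_DOWNLOAD_DELAY = True      # jitter each delay to 0.5x-1.5x of the base
CONCURRENT_REQUESTS_PER_DOMAIN = 1   # one request at a time per domain

# Optional: let Scrapy's AutoThrottle extension adapt the delay to server latency
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 5
AUTOTHROTTLE_MAX_DELAY = 60
```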


慕容4120926

2017-07-09

Teacher, I'm seeing the same thing. How do I handle it?

bobby
You can do the same as the student above and set a download delay.
2017-07-10

bobby

2017-06-12

This is redirecting you to the login page. It means either your login failed, you requested pages that require login, or you crawled too fast and Lagou flagged you as a crawler. Try slowing your crawl down.


Course: Scrapy打造搜索引擎 (Build a Search Engine with Scrapy) — master Scrapy and build a search engine with Django + Elasticsearch
