302 redirects when crawling Lagou (拉勾网)

Source: 7-6 Parsing job postings with item loader

慕粉9451062

2019-01-29

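Teacher, when I crawl Lagou I set a User-Agent and a cookie, but I still get 302 redirects. (I can still open the site in a browser, so my IP probably has not been blocked.)
Why is this happening?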


1 answer

慕粉9451062

Asker

2019-01-30

Additional console output:

2019-01-30 09:32:27 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lagou.com/> (referer: None)
2019-01-30 09:32:28 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5455295.html&t=1548811960&_ti=1> from <GET https://www.lagou.com/jobs/5455295.html>
2019-01-30 09:32:29 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5455295.html&t=1548811960&_ti=1> (referer: https://www.lagou.com/)
2019-01-30 09:32:29 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2960414.html&t=1548811961&_ti=1> from <GET https://www.lagou.com/jobs/2960414.html>
2019-01-30 09:33:48 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5455295.html&t=1548811960&_ti=1>
{'crawl_time': datetime.datetime(2019, 1, 30, 9, 32, 29, 341141),
 'url': 'https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5455295.html&t=1548811960&_ti=1',
 'url_object_id': 'aef6e325d316968e50cd5cc87009e0f4'}
2019-01-30 09:33:50 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/2856806.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2019-01-30 09:33:52 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2960511.html&t=1548811961&_ti=1> from <GET https://www.lagou.com/jobs/2960511.html>
2019-01-30 09:33:54 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5357786.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2019-01-30 09:33:56 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5474115.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:33:57 [scrapy.extensions.logstats] INFO: Crawled 2 pages (at 2 pages/min), scraped 1 items (at 1 items/min)
2019-01-30 09:33:59 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5528540.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:34:02 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5197365.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:34:04 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/2844860.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:34:06 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5497110.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:34:10 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F3165552.html&t=1548812060&_ti=1> from <GET https://www.lagou.com/jobs/3165552.html>
2019-01-30 09:34:13 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5481459.html&t=1548812060&_ti=1> from <GET https://www.lagou.com/jobs/5481459.html>
2019-01-30 09:34:15 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5317219.html&t=1548812060&_ti=1> from <GET https://www.lagou.com/jobs/5317219.html>
2019-01-30 09:34:17 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F3189780.html&t=1548812060&_ti=1> from <GET https://www.lagou.com/jobs/3189780.html>
2019-01-30 09:34:20 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F3111672.html&t=1548812069&_ti=1> from <GET https://www.lagou.com/jobs/3111672.html>
2019-01-30 09:34:22 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2960511.html&t=1548811961&_ti=1> (referer: https://www.lagou.com/)
2019-01-30 09:34:24 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2960414.html&t=1548811961&_ti=1> (referer: https://www.lagou.com/)
2019-01-30 09:34:26 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2832873.html&t=1548812069&_ti=1> from <GET https://www.lagou.com/jobs/2832873.html>
2019-01-30 09:34:27 [scrapy.extensions.logstats] INFO: Crawled 4 pages (at 2 pages/min), scraped 1 items (at 0 items/min)
2019-01-30 09:34:29 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F3071732.html&t=1548812078&_ti=1> from <GET https://www.lagou.com/jobs/3071732.html>
2019-01-30 09:34:32 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5474115.html&t=1548812078&_ti=1> from <GET https://www.lagou.com/jobs/5474115.html>
2019-01-30 09:34:34 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5357786.html&t=1548812078&_ti=1> from <GET https://www.lagou.com/jobs/5357786.html>
2019-01-30 09:34:36 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2769395.html&t=1548812078&_ti=1> from <GET https://www.lagou.com/jobs/2769395.html>
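
In the log above, every 302 points at `/utrack/trackMid.html`, with the originally requested job URL carried percent-encoded in the `f` query parameter, and the one scraped item has the tracking page as its `url` rather than the job page. The following is a minimal sketch, using only the Python standard library, of how a callback could recognize such a tracking URL and recover the job URL it stands for. The names `TRACK_PATH`, `is_tracking_url`, and `original_url` are hypothetical helpers introduced here for illustration; the sketch does not by itself stop the site from redirecting.

```python
from urllib.parse import urlsplit, parse_qs

# Path of the intermediate tracking page, as it appears in the log above.
TRACK_PATH = "/utrack/trackMid.html"


def is_tracking_url(url):
    """Return True if `url` points at the tracking page seen in the log."""
    return urlsplit(url).path == TRACK_PATH


def original_url(url):
    """Return the job URL carried in the `f` parameter of a tracking URL.

    If `url` is not a tracking URL, or has no `f` parameter, it is
    returned unchanged.
    """
    parts = urlsplit(url)
    if parts.path != TRACK_PATH:
        return url
    # parse_qs percent-decodes the values for us.
    targets = parse_qs(parts.query).get("f")
    return targets[0] if targets else url


if __name__ == "__main__":
    redirected = (
        "https://www.lagou.com/utrack/trackMid.html"
        "?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5455295.html"
        "&t=1548811960&_ti=1"
    )
    print(is_tracking_url(redirected))
    print(original_url(redirected))
```

In a spider callback, a check such as `is_tracking_url(response.url)` before running the item loader would at least keep tracking pages from being stored as job items, as happened with the item in the log.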


慕粉9451062 replied to bobby:
OK, teacher.
2019-02-01
2 replies in total
