302 problem on Lagou (拉勾网)
Source: 7-6 Parsing job postings with the item loader
慕粉9451062
2019-01-29
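Teacher, when I crawl Lagou I set a User-Agent and cookies, but I still get 302 redirects. (I can still open the site in a browser, so my IP probably has not been blocked.)
Could you tell me why this happens?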
1 answer
慕粉9451062
Asker
2019-01-30
Additional console output:
2019-01-30 09:32:27 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lagou.com/> (referer: None)
2019-01-30 09:32:28 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5455295.html&t=1548811960&_ti=1> from <GET https://www.lagou.com/jobs/5455295.html>
2019-01-30 09:32:29 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5455295.html&t=1548811960&_ti=1> (referer: https://www.lagou.com/)
2019-01-30 09:32:29 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2960414.html&t=1548811961&_ti=1> from <GET https://www.lagou.com/jobs/2960414.html>
2019-01-30 09:33:48 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5455295.html&t=1548811960&_ti=1>
{'crawl_time': datetime.datetime(2019, 1, 30, 9, 32, 29, 341141), 'url': 'https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5455295.html&t=1548811960&_ti=1', 'url_object_id': 'aef6e325d316968e50cd5cc87009e0f4'}
2019-01-30 09:33:50 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/2856806.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2019-01-30 09:33:52 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2960511.html&t=1548811961&_ti=1> from <GET https://www.lagou.com/jobs/2960511.html>
2019-01-30 09:33:54 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5357786.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2019-01-30 09:33:56 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5474115.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:33:57 [scrapy.extensions.logstats] INFO: Crawled 2 pages (at 2 pages/min), scraped 1 items (at 1 items/min)
2019-01-30 09:33:59 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5528540.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:34:02 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5197365.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:34:04 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/2844860.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:34:06 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.lagou.com/jobs/5497110.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2019-01-30 09:34:10 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F3165552.html&t=1548812060&_ti=1> from <GET https://www.lagou.com/jobs/3165552.html>
2019-01-30 09:34:13 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5481459.html&t=1548812060&_ti=1> from <GET https://www.lagou.com/jobs/5481459.html>
2019-01-30 09:34:15 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5317219.html&t=1548812060&_ti=1> from <GET https://www.lagou.com/jobs/5317219.html>
2019-01-30 09:34:17 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F3189780.html&t=1548812060&_ti=1> from <GET https://www.lagou.com/jobs/3189780.html>
2019-01-30 09:34:20 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F3111672.html&t=1548812069&_ti=1> from <GET https://www.lagou.com/jobs/3111672.html>
2019-01-30 09:34:22 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2960511.html&t=1548811961&_ti=1> (referer: https://www.lagou.com/)
2019-01-30 09:34:24 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2960414.html&t=1548811961&_ti=1> (referer: https://www.lagou.com/)
2019-01-30 09:34:26 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2832873.html&t=1548812069&_ti=1> from <GET https://www.lagou.com/jobs/2832873.html>
2019-01-30 09:34:27 [scrapy.extensions.logstats] INFO: Crawled 4 pages (at 2 pages/min), scraped 1 items (at 0 items/min)
2019-01-30 09:34:29 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F3071732.html&t=1548812078&_ti=1> from <GET https://www.lagou.com/jobs/3071732.html>
2019-01-30 09:34:32 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5474115.html&t=1548812078&_ti=1> from <GET https://www.lagou.com/jobs/5474115.html>
2019-01-30 09:34:34 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5357786.html&t=1548812078&_ti=1> from <GET https://www.lagou.com/jobs/5357786.html>
2019-01-30 09:34:36 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.lagou.com/utrack/trackMid.html?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F2769395.html&t=1548812078&_ti=1> from <GET https://www.lagou.com/jobs/2769395.html>
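Reading the log: every job URL is answered with a 302 to utrack/trackMid.html, and the original job URL travels URL-encoded in the `f` query parameter. The one scraped item has trackMid.html as its `url`, so the parse callback appears to have run on the tracking page rather than on the job page. As a minimal sketch (the helper name `original_url_from_redirect` is hypothetical, not part of Scrapy or the course code), the original URL can be recovered from a redirect target with the standard library. This only decodes the parameter; it does not explain or bypass the redirect itself.

```python
from urllib.parse import urlsplit, parse_qs


def original_url_from_redirect(redirect_url):
    """Return the URL carried in the `f` query parameter, or None if absent.

    Hypothetical helper for illustration; parse_qs percent-decodes the value.
    """
    query = parse_qs(urlsplit(redirect_url).query)
    values = query.get("f")
    return values[0] if values else None


# Redirect target copied from the console output above.
redirect = (
    "https://www.lagou.com/utrack/trackMid.html"
    "?f=https%3A%2F%2Fwww.lagou.com%2Fjobs%2F5455295.html&t=1548811960&_ti=1"
)
print(original_url_from_redirect(redirect))
# prints: https://www.lagou.com/jobs/5455295.html
```

A check like this inside the callback would at least make it possible to skip or flag responses that landed on the tracking page instead of storing them as job items.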
2019-02-01