How can I fetch the data from this site?
Source: 9-1 Selenium dynamic page requests and simulated Zhihu login
begin_0002
2021-11-01
POST URL: http://www.jijiyouxuan.com/index.php?s=/index/search/goodlistnew.html
The data is visible in the browser's XHR requests.
Sending the same POST from Scrapy returns a 500 error:
2021-11-01 23:25:05 [scrapy.core.engine] DEBUG: Crawled (500) <POST http://www.jijiyouxuan.com/index.php?s=/index/search/goodlistnew.html> (referer: None)
2021-11-01 23:25:05 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <500 http://www.jijiyouxuan.com/index.php?s=/index/search/goodlistnew.html>: HTTP status code is not handled or not allowed
import scrapy
import undetected_chromedriver as uc  # assumed: "uc" refers to undetected-chromedriver


def start_requests(self):
    # Open the site in a real browser so cookies can be collected after manual steps.
    browser = uc.Chrome()
    browser.get("http://www.jijiyouxuan.com/")
    input("Press Enter to continue: ")
    # Flatten Selenium's list of cookie dicts into {name: value} for Scrapy.
    cookie = browser.get_cookies()
    cookie_dict = {}
    for cook in cookie:
        cookie_dict[cook["name"]] = cook["value"]
    print(cookie_dict)
    # Form fields copied from the XHR request.
    data = {
        'category_id': '1221',
        'brand_id': '0',
        'manner_id': '0',
        'material_id': '0',
        'size_id': '0',
        'other_id': '0',
        'bed_id': '0',
        'sofa_id': '0',
        'chandi_id': '0',
        'thickness_id': '0',
        'price1': '0',
        'price2': '0',
        'tags': '0',
        'wd': '',
        'page': '1',
        'order_by_field': 'default',
        'order_by_type': 'asc'
    }
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
    }
    for url in self.start_urls:
        yield scrapy.FormRequest(url=url, formdata=data, cookies=cookie_dict, headers=headers, callback=self.parse)


def parse(self, response):
    pass
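One thing worth checking is whether the Scrapy request differs from the XHR the browser sends. The log shows `referer: None`, and the hard-coded Chrome 78 User-Agent probably does not match the browser that produced the cookies. The sketch below uses only the standard library to build cookies and headers closer to a browser XHR. The helper names `cookies_to_dict` and `build_xhr_headers` are hypothetical, and whether this server actually checks `Referer` or `X-Requested-With` is an assumption, not something confirmed by the post.

```python
from urllib.parse import urlencode

SITE_ROOT = "http://www.jijiyouxuan.com/"


def cookies_to_dict(selenium_cookies):
    """Flatten the list returned by browser.get_cookies() into {name: value}."""
    return {c["name"]: c["value"] for c in selenium_cookies}


def build_xhr_headers(user_agent, referer=SITE_ROOT):
    """Headers resembling a browser XHR; which ones the server checks is a guess."""
    return {
        "User-Agent": user_agent,  # ideally the same UA as the browser that got the cookies
        "Referer": referer,
        "X-Requested-With": "XMLHttpRequest",
    }


if __name__ == "__main__":
    # Hypothetical cookie, shaped like one entry of Selenium's get_cookies() result.
    sample = [{"name": "session", "value": "abc123", "domain": "www.jijiyouxuan.com"}]
    print(cookies_to_dict(sample))
    print(build_xhr_headers("Mozilla/5.0 (example)"))
    # URL-encoded body, for comparison with the form data shown in the browser's network tab.
    print(urlencode({"category_id": "1221", "page": "1"}))
```

In the spider, the browser's real User-Agent can be read with `browser.execute_script("return navigator.userAgent")` and passed to `build_xhr_headers`. Setting `handle_httpstatus_list = [500]` on the spider lets the 500 response reach `parse`, where `response.text` may contain the server's error message.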
1 answer
bobby
2021-11-03
Does it still fail even when using undetected chromedriver?