When I scrape a JD product, the response is JSON rather than HTML page source
Source: 9-5 Fetching JD detail-page data with requests
萝卜_呀
2022-12-15
import json

import requests
from scrapy import Selector  # not used in this snippet

# Request headers copied from the browser's developer tools.
header = {
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    "accept-encoding": "gzip, deflate, br",
    "accept-language": "zh-CN,zh;q=0.9",
    "cache-control": "no-cache",
    "cookie": "__jdu=165839070091428242361; areaId=12; unpl=JF8EAMZnNSttCElVBhJWHRIZHgkEWw0ITUcHZm9QAQ4NHgBWTAAYGxR7XlVdXhRLFB9vYxRUWFNPUQ4fASsSEXteXVdZDEsWC2tXVgQFDQ8VXURJQlZAFDNVCV9dSRZRZjJWBFtdT1xWSAYYRRMfDlAKDlhCR1FpMjVkXlh7VAQrAhwWGUlVVVhcCUMXBmxuBlFeWkxUNRoyHCIge1VRVlgATicCX2Y1FgkET1QGHQsaXxBMWV1cVQlNFgJnZwBXVFtOVwccAisTIEg; PCSYCityID=CN_320000_320100_0; shshshfpa=56830605-0919-8000-6b3b-3ff0f3545bc7-1671087296; shshshfpb=docil2REOEEaq9AoAYKpryQ; jsavif=1; ipLoc-djd=12-904-907-50559; __jdv=76161171|baidu-pinzhuan|t_288551095_baidupinzhuan|cpc|0f3d30c8dba7459bb52f2eb5eba8ac7d_0_a3028e718dea4a17a589ddbddebf3285|1671088853374; token=7e3cb9c92296983cea6b844a110a8b66,2,928383; __tk=SIvJSkGFVcVLVDlDTkY5VIZLTUvJSAmIVDe3VkKKTIqBiAmHiATLTY,2,928383; jsavif=1; shshshfp=e83377ee8605da4f8401de2d6feb1745; shshshsID=38ca4960d13fe4c436046f2db54f7831_7_1671089823646; __jda=122270672.165839070091428242361.1658390700.1670318982.1671087295.3; __jdb=122270672.7.165839070091428242361|3.1671087295; __jdc=122270672; ip_cityCode=904; 3AB9D23F7A4B3C9B=M77X4R5XBVYKETECF7SBDNBKA6U5D6TCPXLIFTCERMW7NX3RRSB32UG57TJLUAVVEKNRCY6BVC57CM5E6OUMZFCSBU",
    "pragma": "no-cache",
    "sec-ch-ua": '"Google Chrome";v="107", "Chromium";v="107", "Not=A?Brand";v="24"',
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "Windows",
    "sec-fetch-dest": "document",
    "sec-fetch-mode": "navigate",
    "sec-fetch-site": "none",
    "sec-fetch-user": "?1",
    "upgrade-insecure-requests": "1",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
}


def parse_good(sku_id):
    # Build the request URL for the given SKU.
    url = "https://item-soa.jd.com/getWareBusiness?skuId={}".format(sku_id)
    html = requests.get(url, headers=header).text
    print(html)
    # The response body is parsed as JSON here, not as an HTML page.
    data = json.loads(html)
    name = data["suit"]["mainSkuName"]
    print(name)


if __name__ == "__main__":
    parse_good(100016376331)
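To see what the endpoint actually sends back, one option is to look at the Content-Type header and the first characters of the body before deciding how to parse it. The following is only a minimal sketch: describe_response is a hypothetical helper (not part of requests or of the course code), and its heuristics are assumptions rather than a complete content sniffer.

```python
import json


def describe_response(content_type, body):
    """Classify a response body as 'json', 'html', or 'unknown'.

    content_type: value of the Content-Type header (may be None).
    body: the response text.
    """
    ctype = (content_type or "").lower()
    text = body.lstrip()

    # Treat it as JSON only if it really parses as JSON.
    if "json" in ctype or text.startswith(("{", "[")):
        try:
            json.loads(text)
            return "json"
        except ValueError:
            pass

    # Fall back to a simple HTML check on the header or the leading tag.
    if "html" in ctype or text.lower().startswith(("<!doctype", "<html")):
        return "html"

    return "unknown"
```

With the script above, it could be called as describe_response(resp.headers.get("Content-Type"), resp.text), where resp is the object returned by requests.get. Printing resp.status_code alongside the result also helps when describing the problem to others.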
1 answer
bobby
2022-12-17
What does the response actually contain? Could you post a screenshot so I can take a look?