Teacher, my JD.com scraper returns no data. Could you please take a look? Thanks.
Source: 2-9 Scraping data from JD.com

在高原的阿北
2019-11-01
import requests
from lxml import html


def spider_id(sn):
    """ Scrape book information from the JD.com store """
    url = 'https://search.jd.com/Search?keyword={0}'.format(sn)
    # Fetch the HTML
    html_data = requests.get(url).text
    # Build the XPath selector object
    selector = html.fromstring(html_data)
    # Find the list of books
    ul_list = selector.xpath('//div[@id="J_goodsList"]/ul/li')
    print(len(ul_list))


if __name__ == '__main__':
    sn = '9787115428028'
    spider_id(sn)
1 answer
NavCat
2019-11-06
JD.com has added an anti-scraping mechanism. Adding a User-Agent to the request headers is enough to get past it. Reference code:
import requests
from lxml import html


def spider_id(sn):
    """ Scrape book information from the JD.com store """
    url = 'https://search.jd.com/Search?keyword={0}'.format(sn)
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'
    }
    # Fetch the HTML
    html_data = requests.get(url, headers=headers).text
    # print(html_data)
    # Build the XPath selector object
    selector = html.fromstring(html_data)
    # Find the list of books
    ul_list = selector.xpath('//div[@id="J_goodsList"]/ul/li')
    print(len(ul_list))


if __name__ == '__main__':
    sn = '9787115428028'
    spider_id(sn)
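Some background on why the header matters: when no User-Agent is given, requests identifies itself as python-requests/<version>, which a site can easily tell apart from a browser. If you want to check what will be sent before touching the network, here is a rough sketch using only the standard library. The names BROWSER_UA and build_search_request are made up for this illustration and are not part of the course code; the request is built but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical constant: the browser User-Agent string from the answer above.
BROWSER_UA = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36')


def build_search_request(sn, user_agent=BROWSER_UA):
    """Build (but do not send) a JD.com search request with a browser User-Agent."""
    # urlencode escapes spaces and non-ASCII characters in the keyword
    query = urlencode({'keyword': sn})
    url = 'https://search.jd.com/Search?' + query
    return Request(url, headers={'User-Agent': user_agent})


req = build_search_request('9787115428028')
print(req.full_url)
# urllib stores header names capitalized, hence 'User-agent' here
print(req.get_header('User-agent'))
```

With requests itself, the same headers dict is passed as requests.get(url, headers=headers), exactly as in the reference code. If the count is still 0 after that, uncommenting print(html_data) shows what the site actually returned.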