len(ul_list) is 30, but looping over ul_list prints only empty lists?
Source: 2-9 Scraping data from JD.com

野生前端菜鸟
2018-07-07
from lxml import html
import requests


def spider_jd(sn):
    url = 'https://search.jd.com/Search?keyword={0}'.format(sn)
    res = requests.get(url)
    res.encoding = 'utf-8'
    html_data = res.text
    selector = html.fromstring(html_data)
    # Find the list of books
    ul_list = selector.xpath('//div[@id="J_goodsList"]/ul/li')
    print(len(ul_list))
    for li in ul_list:
        # title
        title = selector.xpath('div/div[@class="p-name"]/a/@title')
        print(title)


if __name__ == '__main__':
    sn = 9787115428028
    spider_jd(sn)

The result is that every title printed is an empty list:
30
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
1 answer
Look at this line in your loop (line 22 of your original code):
title = selector.xpath('div/div[@class="p-name"]/a/@title')
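This searches from the whole document rather than from the current item. Remember the rule: "grab the big pieces first, then the small ones." Once you have found each item, you have to match again inside that item. Here selector is the root of the parsed page, so the relative path div/div/... is resolved against that root, where there is no direct child div, and the result is an empty list on every iteration. The lookup should start from the li element produced by the loop instead. The code is as follows: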
title = li.xpath('div/div[@class="p-name"]/a/@title')
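
To see the difference without hitting the network, here is a minimal sketch that runs both versions against a small inline page. SAMPLE_HTML and the two helper functions are hypothetical names made up for this illustration, and the markup is only assumed to resemble the structure the XPath in the question expects, not the real JD page.

```python
from lxml import html

# Hypothetical sample markup (an assumption), shaped to match the XPath in the question.
SAMPLE_HTML = """
<html><body>
<div id="J_goodsList">
  <ul>
    <li><div><div class="p-name"><a title="Book A" href="#">Book A</a></div></div></li>
    <li><div><div class="p-name"><a title="Book B" href="#">Book B</a></div></div></li>
  </ul>
</div>
</body></html>
"""


def titles_from_root(html_data):
    """Buggy version: the relative path is evaluated against the document root."""
    selector = html.fromstring(html_data)
    ul_list = selector.xpath('//div[@id="J_goodsList"]/ul/li')
    results = []
    for li in ul_list:
        # The root <html> element has no direct <div> child, so this is always empty.
        results.append(selector.xpath('div/div[@class="p-name"]/a/@title'))
    return results


def titles_from_item(html_data):
    """Fixed version: the relative path is evaluated against each li."""
    selector = html.fromstring(html_data)
    ul_list = selector.xpath('//div[@id="J_goodsList"]/ul/li')
    results = []
    for li in ul_list:
        # Same expression, but the context node is now the current list item.
        results.append(li.xpath('div/div[@class="p-name"]/a/@title'))
    return results


if __name__ == '__main__':
    print(titles_from_root(SAMPLE_HTML))
    print(titles_from_item(SAMPLE_HTML))
```

Both functions find the same two li elements; only the context node of the inner xpath call differs, which is why the length check in the question passed while every title came back empty.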