Hello, I am trying to extract a block of text, but the script only prints []
Source: 2-10 Scraping data from 1号店

qq_逝爱终成伤_0
2021-04-17
Could you please take a look at how I should extract the text shown in the screenshot? Thank you.
from lxml import html
import requests


def qidian_book():
    # Fetch the HTML of the Qidian chapter page
    url = 'https://read.qidian.com/chapter/_Tkog5c6GJI1/YS5aRA0ynV8ex0RJOkJclQ2'
    kv = {'user-agent': 'Mozilla/5.0'}
    qidian_html = requests.get(url, headers=kv).text
    # Parse the HTML into an element tree that can be queried with XPath
    qidian = html.fromstring(qidian_html)
    book = qidian.xpath('//div[@id="j_chapterBox"]/div[@class="text-wrap"]/div/div[@class="read-content j_readContent"]/p')
    print(len(book))
    for li in book:
        novel = li.xpath('span/text()')
        print(novel)


if __name__ == '__main__':
    qidian_book()
1 answer
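You should save the HTML to a file and inspect its structure. This line of code needs to change:
novel = li.xpath("span/text()")
Change it to:
novel = li.xpath("text()")
If you look at the page structure, the p tags do not contain any span tags, so span/text() matches nothing and returns an empty list.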
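
To see the difference between the two XPath expressions without hitting the site, here is a minimal sketch. The `sample` markup is a hypothetical stand-in that only mimics the structure described in the answer (p tags holding text directly, no span); the real page may differ, so it is worth checking against the HTML you saved.

```python
from lxml import html

# Hypothetical sample markup (an assumption, not the real page):
# the paragraphs hold their text directly, with no <span> child.
sample = '''
<div id="j_chapterBox">
  <div class="text-wrap">
    <div>
      <div class="read-content j_readContent">
        <p>First paragraph.</p>
        <p>Second paragraph.</p>
      </div>
    </div>
  </div>
</div>
'''

tree = html.fromstring(sample)
paragraphs = tree.xpath('//div[@class="read-content j_readContent"]/p')

# span/text() looks for text inside <span> children of each <p>;
# there are none, so every result is an empty list.
with_span = [p.xpath('span/text()') for p in paragraphs]

# text() selects the text nodes that sit directly inside each <p>.
direct = [p.xpath('text()') for p in paragraphs]

print(with_span)
print(direct)
```

With this sample, the first print shows only empty lists, which matches the [] output in the question, while the second shows the paragraph text.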