The list of viewer counts I get isn't what I want. Is my regex wrong?
Source: 13-8 Regex analysis: extracting names and viewer counts
已婚单身狗
2019-01-05
import re
from urllib import request

class Spider():
    url = 'https://www.zhanqi.tv/games/lol'
    root_pattern = '<div class="meat">([\s\S]*?)</div>'
    name_pattern = '<span class="anchor anchor-to-cut dv">([\s\S]*?)</span>'
    number_pattern = '<span class="dv">([\s\S]*?)</span>'

    def __fetch_content(self):
        r = request.urlopen(Spider.url)
        htmls = r.read()
        htmls = str(htmls, encoding='utf-8')
        return htmls

    def __analysis(self, htmls):
        root_htmls = re.findall(Spider.root_pattern, htmls)
        anchors = []
        for html in root_htmls:
            name = re.findall(Spider.name_pattern, html)
            number = re.findall(Spider.number_pattern, htmls)
            anchor = {'name': name, 'number': number}
            anchors.append(anchor)
        print(anchors[0])
        a = 1

    # def __refine(self, anchors):
    #     pass

    def go(self):
        htmls = self.__fetch_content()
        self.__analysis(htmls)
        # self.__refine(anchors)

spider = Spider()
spider.go()
I'd like to ask why the data captured by my number regex isn't what I want.
{'name': ['辰风丶'], 'number': ['${online}', '我的关注', '全部直播', '…${online}']}
This is the list I printed out.
2 answers
已婚单身狗
Asker
2019-01-05
Sorry to bother everyone, I've figured out where the problem is.
已婚单身狗
Asker
2019-01-05
url = 'https://www.zhanqi.tv/games/lol'
root_pattern = '<div class="meat">([\s\S]*?)</div>'
name_pattern = '<span class="anchor anchor-to-cut dv">([\s\S]*?)</span>'
number_pattern = '<span class="dv">([\s\S]*?)</span>'
The key point is number_pattern: the information it captures is wrong.
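The thread never says what the fix turned out to be. One likely cause, visible in the posted `__analysis`, is that `number` is searched in `htmls` (the whole page) instead of `html` (a single anchor card), so every `<span class="dv">` on the page ends up in the list. Below is a minimal sketch of that difference. The HTML fragment, the anchor names, and the `search_whole_page` flag are made up for illustration and are not the real zhanqi.tv markup.

```python
import re

# The three patterns from the post, written as raw strings.
ROOT_PATTERN = r'<div class="meat">([\s\S]*?)</div>'
NAME_PATTERN = r'<span class="anchor anchor-to-cut dv">([\s\S]*?)</span>'
NUMBER_PATTERN = r'<span class="dv">([\s\S]*?)</span>'

# Hypothetical page: one unrelated "dv" span plus two anchor cards.
SAMPLE_HTML = '''
<span class="dv">My follows</span>
<div class="meat">
  <span class="anchor anchor-to-cut dv">anchor_a</span>
  <span class="dv">1234</span>
</div>
<div class="meat">
  <span class="anchor anchor-to-cut dv">anchor_b</span>
  <span class="dv">567</span>
</div>
'''

def analyse(htmls, search_whole_page):
    """Collect name/number per card; the flag reproduces the suspected bug."""
    anchors = []
    for html in re.findall(ROOT_PATTERN, htmls):
        name = re.findall(NAME_PATTERN, html)
        # Suspected bug: searching `htmls` matches spans outside this card.
        target = htmls if search_whole_page else html
        number = re.findall(NUMBER_PATTERN, target)
        anchors.append({'name': name, 'number': number})
    return anchors

buggy = analyse(SAMPLE_HTML, search_whole_page=True)
fixed = analyse(SAMPLE_HTML, search_whole_page=False)
print(buggy[0])  # {'name': ['anchor_a'], 'number': ['My follows', '1234', '567']}
print(fixed[0])  # {'name': ['anchor_a'], 'number': ['1234']}
```

If this is indeed the cause, changing `htmls` to `html` in the `number = re.findall(...)` line of the original loop should be enough, and `number_pattern` itself can stay as posted.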