I copied the Chapter 13 crawler code directly into VS Code, but it doesn't scrape any content and only prints an empty list. How can I fix this?

Source: 13-10 sorted (sorting)

慕哥1045538

2018-07-28

import re

from urllib import request

class Spider():
      url = 'https://www.panda.tv/cate/fortnite'
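      # The non-greedy quantifier *? belongs outside the character class:
      # [\s\S]*? matches any run of characters, as few as possible.
      root_pattern = r'<div class="video-info">([\s\S]*?)</div>'
      name_pattern = r'</i>([\s\S]*?)</span>'
      number_pattern = r'<span class="video-number">([\s\S]*?)</span>'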


      def __fetch_content(self):
          # Download the page and decode the raw bytes as UTF-8 text.
          r = request.urlopen(Spider.url)
          htmls = r.read()
          htmls = str(htmls, encoding='utf-8')
          return htmls

      def __analysis(self, htmls):
          # Extract each video-info block, then the name and viewer count inside it.
          root_html = re.findall(Spider.root_pattern, htmls)
          anchors = []
          for html in root_html:
              name = re.findall(Spider.name_pattern, html)
              number = re.findall(Spider.number_pattern, html)
              anchor = {'name': name, 'number': number}
              anchors.append(anchor)
          print(anchors)
          return anchors

      def __refine(self,anchors):
          l = lambda anchor:{'name':anchor['name'][0].strip(),
                             'number':anchor['number'][0]
                             }
          return map(l, anchors)

      def go(self):
          htmls = self.__fetch_content()
          anchors = self.__analysis(htmls)
          anchors = list(self.__refine(anchors))
          print(anchors)


spider = Spider()
spider.go()
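
One likely cause of an empty list with patterns of this kind is where the quantifier sits. Written as `[\s\S*?]`, the `*?` is inside the brackets, so the class matches exactly one character; written as `[\s\S]*?`, it matches any run of characters non-greedily. The sketch below shows the difference on a made-up HTML fragment. The `sample` string is an assumption for illustration, not real markup from the site, and the live page may have other issues (for example, a changed layout) that this does not cover.

```python
import re

# Hypothetical sample markup, invented for illustration only.
sample = '<div class="video-info"><span class="video-number">1234</span></div>'

# Quantifier inside the brackets: the class matches exactly ONE character
# (whitespace, non-whitespace, '*' or '?'), so "1234" cannot fit between the tags.
broken = re.findall(r'<span class="video-number">([\s\S*?])</span>', sample)

# Quantifier outside the brackets: any run of characters, as few as possible.
fixed = re.findall(r'<span class="video-number">([\s\S]*?)</span>', sample)

print(broken)  # []
print(fixed)   # ['1234']
```

If the corrected patterns still return nothing, printing a slice of `htmls` is a quick way to check whether the downloaded page actually contains the expected tags.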


1 answer