Teacher, why does the reference code you provided fail to run?

Source: 2-11 Crawling data from Taobao

Samuel10

2018-11-06



3 answers

NavCat

2018-11-08

Accepted

Try the following steps:

Log in to Taobao in your browser
Find the cookie as shown in the figure below
Configure the request headers

import requests
import re
import json


def spider_tb(sn, book_list=None):
    # Create a fresh list per call instead of sharing a mutable default
    if book_list is None:
        book_list = []
    url = 'https://s.taobao.com/search?q={0}'.format(sn)
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',
        'cookie': 'your cookie'
    }
    # Fetch the HTML content
    text = requests.get(url, headers=headers).text

    # Use a regular expression to find the JSON object
    p = re.compile(r'g_page_config = (\{.+\});\s*', re.M)
    rest = p.search(text)
    if rest:
        print(rest.group(1))
        data = json.loads(rest.group(1))
        bk_list = data['mods']['itemlist']['data']['auctions']

        print(len(bk_list))
        for bk in bk_list:
            # Title
            title = bk["raw_title"]
            print(title)
            # Price
            price = bk["view_price"]
            print(price)
            # Purchase link
            link = bk["detail_url"]
            print(link)
            # Seller
            store = bk["nick"]
            print(store)
            book_list.append({'title': title, 'price': price, 'link': link, 'store': store})
            print('{title}:{price}:{link}:{store}'.format(title=title, price=price, link=link, store=store))
    # Return the collected items so the caller can use them
    return book_list


if __name__ == '__main__':
    spider_tb('9787115428028')

Check the result

//img.mukewang.com/szimg/5be44d400001bc8818880924.jpg
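
The extraction step in the code above can be checked offline, without a cookie or a network connection. The following is only a minimal sketch: `extract_auctions` and `SAMPLE_HTML` are hypothetical names, and the sample page is invented for illustration, not real Taobao output. It reuses the same regular expression and returns an empty list when `g_page_config` is absent or malformed (which may happen, for example, if the site answers with a login page instead of search results; that is an assumption, not something confirmed in this thread).

```python
import json
import re

# Same pattern as in the answer: capture the object assigned to g_page_config.
PAGE_CONFIG_RE = re.compile(r'g_page_config = (\{.+\});\s*', re.M)


def extract_auctions(html):
    """Return the list of auction dicts, or [] if no usable data is found."""
    match = PAGE_CONFIG_RE.search(html)
    if match is None:
        return []
    try:
        data = json.loads(match.group(1))
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        return []
    # Walk the nested keys defensively; any missing level yields [].
    node = data
    for key in ('mods', 'itemlist', 'data', 'auctions'):
        if not isinstance(node, dict) or key not in node:
            return []
        node = node[key]
    return node if isinstance(node, list) else []


# Hypothetical sample page, invented for illustration only.
SAMPLE_HTML = '''<script>
    g_page_config = {"mods": {"itemlist": {"data": {"auctions": [{"raw_title": "Sample Book", "view_price": "59.00", "detail_url": "//example.com/item", "nick": "sample_store"}]}}}};
</script>'''

if __name__ == '__main__':
    for item in extract_auctions(SAMPLE_HTML):
        print(item['raw_title'], item['view_price'])
    # A page without g_page_config gives an empty list instead of a KeyError.
    print(extract_auctions('<html>login page</html>'))
```

If this runs correctly on the sample but the real page yields an empty list, the problem is more likely in the request (cookie, headers) than in the parsing.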

weixin_慕工程3261254

Teacher, your latest code does not run for me:

Traceback (most recent call last):
  File "C:/PYTOOLS/static/spider_taobao.py", line 42, in <module>
    spider_tb('9787115428028')
  File "C:/PYTOOLS/static/spider_taobao.py", line 13, in spider_tb
    text = requests.get(url, headers=headers).text
  File "C:\PYTOOLS\venv\lib\site-packages\requests\api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "C:\PYTOOLS\venv\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\PYTOOLS\venv\lib\site-packages\requests\sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)

2018-12-02

1 reply in total
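
The traceback in the comment is cut off inside `requests`, before the line that names the exception, so the actual cause cannot be read from it. One way to surface it is to wrap the request and print the exception class. This is a rough sketch only: `fetch_text` and its `get` parameter are hypothetical names introduced here, and the `timeout` value is an arbitrary choice.

```python
def fetch_text(url, headers, get=None, timeout=10):
    """Return the response body, or None if the request fails."""
    if get is None:
        # Third-party library; imported lazily so `get` can be stubbed in tests.
        import requests
        get = requests.get
    try:
        resp = get(url, headers=headers, timeout=timeout)
    except OSError as exc:
        # requests.exceptions.RequestException subclasses IOError (OSError),
        # so this also catches its ConnectionError, SSLError, ProxyError
        # and Timeout subclasses.
        print('request failed: {0}: {1}'.format(type(exc).__name__, exc))
        return None
    if resp.status_code != 200:
        print('unexpected status: {0}'.format(resp.status_code))
        return None
    return resp.text
```

Calling `fetch_text(url, headers)` in place of `requests.get(url, headers=headers).text` prints the exception name (for instance a proxy, SSL, or connection error) instead of a long traceback, which makes it easier to report the real failure.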