The crawl failed with an error after only a few minutes; it turns out Zhihu had detected abnormal activity

Source: 6-21 Saving data to MySQL - 3

HugoL

2019-01-08

http://img.mukewang.com/szimg/5c34a26c0001222610010675.jpg
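What should I do at this point? Should I add an if check during the Selenium login that recognizes the captcha and fills it in, or is there some other approach?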

4 answers

bobby

2019-01-10

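When the crawler detects this page, you can have your code automatically call Yundama (云打码) to recognize the captcha, enter the result, and then continue crawling.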

HugoL
The requests call fails with "requests.exceptions.InvalidSchema: No connection adapters were found for 'img_url'"
2019-01-14
2 replies in total
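
For context: requests ships with transport adapters only for http:// and https://, so any other scheme (for example a data: URI taken from an img src attribute) raises InvalidSchema. Below is a minimal sketch of checking the src before requesting it, using only the standard library. The function name classify_src and the decision to prefix protocol-relative URLs with https: are assumptions made for illustration, not part of the course code.

```python
from urllib.parse import urlsplit


def classify_src(src):
    """Return (kind, value) describing how an <img> src could be loaded.

    kind is "http" (fetchable with requests), "data" (inline data URI that
    must be decoded locally), or "unknown".
    """
    parts = urlsplit(src)
    if parts.scheme in ("http", "https"):
        return "http", src
    if parts.scheme == "data":
        # Inline image: nothing to download, decode the payload instead.
        return "data", src
    if not parts.scheme and parts.netloc:
        # Protocol-relative URL such as //img.example.com/a.jpg.
        # Assumption: the host serves the image over HTTPS.
        return "http", "https:" + src
    return "unknown", src
```

Only values classified as "http" would be passed to requests.get; a "data" value needs to be decoded rather than requested.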

HugoL

Asker

2019-01-16

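This is the result after integrating it into Scrapy, where it only returns an empty value: //img.mukewang.com/szimg/5c3f3da30001504c10010705.jpg
This is the result when run on its own, where recognition works correctly:
//img.mukewang.com/szimg/5c3f3da30001a90c10010894.jpg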


bobby

2019-01-15

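//img.mukewang.com/szimg/5c3d9eb20001163111520088.jpg This is the URL request I just analyzed. The src attribute here holds the complete base64-encoded image. You can read that attribute and cut out the base64 data; note that the data begins right after the comma. Copy everything after the comma, and you can save it directly as a jpg file like this: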

import base64
image = "R0lGODdhlgA8AIcAAP7+/gICAujo6NfX1/Pz8xcXF8jIyCcnJ0dHR5eXl1dXV3d3d2ZmZoeHhzQ0NLe3t6enp7e3t6qqqjw8PMHBwUBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACwAAAAAlgA8AEAI/wABCBxIsKDBgwgTKlzIsKHDhxAjSmxQoEGCABgDIHgAgQCAjyBBCkgwAECEAgFSqkzJgACDAAkKAJhJs6ZNAAQKBNjJQACAn0CDCh1KtKjRo0iFDlDgoECAAAoEAJhKdSqEAA0AaN0KQACAr2ABPCiQIECAAw0GKADAti2BCADiyp37AIDdu3gPBNi7VwCAv4AFLEBwQEGAw4gDKBgAoHFjAQEcBJgcoIADAJgza9YsIAEDAgBCix5NujTpBQcCqC4wAIDr17BdDwBAm3aCAgFy60ZAAIBv3wQYOCgQIACCBgYAKFdOYAGA59CfN0CAIEAABAIAaN+uvYGDAgcGAP8YT748AAMADARYv74BgPfw3x8IQB/BAAD48wsAwL+/f4AABA4kWJDgAAYQAiwsIADAQ4gQBQyA0KBAgAAHIChAAMCjRwEBAjg4MEAAAJQABjwA0NLlS5cRAswsoEAAAJw5cT4gAMDnz58CBABoYKDBggBJAxxAcCBAgAMJBhAAUNUqAAIGAGzl2tXrV7BhB0BwECDAAQcCAKxdC+HAAQkAADgIUNeBggAA9O4FMIDCgAEABA8mXNgwBAYBFAeIAMDxY8iRDQCgXLlBgAAHECxIAMDzZ88GHAQIcICBAQCpVa9m3dr1a9YEGiRwECBAgQcAABhQAIEAAODBhQ8HHiH/woMBBgYAAPBgAAEA0aVPpy5dAIIABgBs597dAADw4cWPFy8gwQMCANSvZ9/e/Xv48eXPp1/f/n38+fXv57//AcAAARgYQPBAAICEChcybOjwIcSIEidSrBgggYICBR5EECAgQQAEARAoCFBAAQEABACwBBABAMyYMAkwGPCgQICcCgDw7OlTAIQKBwoEAGD0KFKjAxwgGCDAwYQAAQoQMADgKgABAgBw7er1KwAJBgwQgIAgAIMGDRgwaFD
ggIMAARYAqGv3Lt68egEIaKAgQAAFAwAQLmz4MAEAihczSBDgQIACARJAAGD5MgADDAIEKABhwAAAokUnAGD69GkH/wcCBGiAAADs2LAfBKht2wEAAAQUBFjQgAAAAAwCEHdgYAKA5MqXM2dOAAD06NKnU69uYEGA7AESAOju3fsAAQAACEhwgIEBBQHWQwDg/j38+PEHCABg/759AQoKBOg/ACAAgQMFPgiAQAAAhQsZCgDw8CEBAwocBCgQAUBGjRkTAADQQIAACQsGEIggAEBKlStZtnSZskGBADMDALB5E6eAAAUC9OxJAQEAoQAEJHBwIEBSBAkiAHDqdMAAAFOpVgVwIECAAg0MAPD6FWxYrwQEADA7QIEDAxAKPFgAAG5cAAImBFgwQAAAvXsBJCAAAHBgwYMJFxYswEEAxQEKHP8YAAAy5AIJEgQIcKABAAAPAiwAIGCAAACjSZc2fRo1AAMBWDtYQABAbNmzadeWPSBAbgkAePf2/Ru4AAMAiBc3fhx5cuXLmTd3/vz4AAICAFS3bkAAAO3buXf3/l27AADjyZc3fx59evXr2bd3/x5+fPnz6ddnr4BBgQABHDRoAFBBgIEBHEQAgDChwoUMGzp8CDGixIkUAQgYACAjgAAQCgRgUEBBgQATDAAQIACAypUqISgIEKBAgJkBCAwYEEGBgwMBGCAIEGABgKFEixYdAOCBgwBMmRYIMACA1KlUq1qtOkDAggQDEAgw4CDAgQcAypo9izZtWgMTCrgNADf/boAGAgAsCIC3AYC9fAEIAAA4MOABCgIweOCgAQEEABo7fgwZAAEAlCtbBvAggOYDCSAAEDBgwYEDBAAIWLAggOoAEQC4fv2aQAQFBhQsSBBhQIICAQ4YYBBBgIAGCw4EUGAAgPLlzJs7Xy4ggQAHAQJAAIA9O3YBEgB4/w6gAYDx5CMEMPAgQYACDwZEAAA//oMDAeozIAAgv34BAPr7BwhAYIIABQMcMABA4UIBDgIsgIAgwIIFASwyACAAwEYADAJ8DHAAwQAAJU2eRGlyAACWLV2+hNmSgAAEAQo0EABA506ePA0AAArUAIMARYsWCBABwFKmAgwEgAoVwgIA/1WrDgCQVWtWAQoaFAgQQAAAsmXJPgiQNi0BAG3dujWQAIKAAwHsPgCQV2/eCAECHEAgAMBgwgQAHEacWPHixQYaBIB8gAAAypUtEwCQGUCEAQwCfAb9AMBo0qMNGAAggAAA1q0BMAAQW7ZsAwsQBAigAMBu3rsjBChQQAIA4sWNDyAAQLkAAQGcQwAQXXp0AwUKIDBAAMB27gcAfAcfXvx48gAEFAiQPoACAO3dux9AAEACBBACHBiAIEAAAP37AyTAQIEEAgAOIkRoAADDhg4BQCgQYOIBAQAuYrx4IACAjh4/AngAoEGAAAcQKBAA4UCAlgcEAIgpE4AAAwBu4v/MqXMnz543BSxgMKEAhAEEACBNqvTBAQcRFgQIcCABAwBWrU4IoDVAgwECAIAFKwAA2bJmySY4EGBtAQBu37olIAAA3bp1BwAA8AABggIBFggwoKCBBAgJCAQIoGCAAACOHzsWAGAy5cqWL2PGPOBAgM4BGAAILVqBAQAABCAIoFp1AQYGAMCGvQCAggANChwQAGA3796+eQtAEGB4gAQAjiNPrhwAAQIAngNAEGB6gwELBADIrh0AAQYBDkAAQAAA+fIAJAgAoH49+/bu37NPECBAgQIJDADIr38BggIIACYYAABCgQUCEgBYIABAQ4cPITocIABARYsXA2TUKAD/QEePHz0OADCS5MgBAVAWWGAAQEuXLgUwUMCAAQCbN20mALCTZ0+fP4H6DDC0wAMIDAQAUKp0AAACTyEkOBAgAAAAEB4IALCVa1evAigAEDuWLNkFAdBOALCWbdu1AgDElTuX7oMACQDk1buXL98DAAAHFjyYcGHDgQUMALCYcWPHjyFHlixZwAEDADBnzqwAQGfPn0GH9hx
gQgQAp1GjJiDAwAAAr2HHlj2bdm3bt3Hn1q0bAQIDAIAHFz6ceHHjx5EnV76ceXPnz6FHlz6denXr17Fn107AQAAJAMCHFz+efHnz59GnV7+efXv35wUoCBCAwQABAxYEYDAAQH///wABCBxIsKDBgwgTKlzIsKHDgwISCBAQwQCAixgzZhxAgUGAjyBDEoiAoIAEAQBSqlzJEgABBBMMAJhJs6bNmzhz6tzJs+fOBwcUGBhA4UCAo0cLAFjKlKmDAAsOMCgQoCqCAFgDHIAQIICCAAUIABhLtmzZAQkWBFi7NoEAAHDjyp1Lty7dAQYEABAAoK/fv4ADCw48gEEBCQcCHBDwQEECBwEKBFDAQIEBAAAGANjMebMACA8QBAhQgAEDAwBSqx7QQAGCAgECMABAu7bt2gIKNGjgIIDvAAggPABAvLjx48iNG0jQIIGDBgkWQFAQoHqABQIAaN/Ovbt37hICBP9gwOAAggcDFgRYz379AQDw4wMgAKC+/foUAjQgUCCAA4AJHAAgWNBgQQIRCiQA0NDhQwALAgSAAOFBAgUIDiwwMADARwQHAiQQAMDkSZQnETRAACABggAIFhwIwGDAAgQNEBwooEAAAKBBhQ4lClRBAQUHAgRY4CCBAABRAQhIcCBAAgBZtQJ4AMDrVwACEjAoEMBBgwYHDABg29btW7hxFQSgG+DAAAB5AUQI0EBAAwYMAgwmPGEAAMSJIRRAMKBBgQIKHgwwMEAAAMyZAUQ4cACBAQChRY8mXZqAAQULCgQIwADAa9iwFwwAUNs2AAEAdO8m8ABCAgQBDiBgUAD/wHHkEgIEUGAAwHPoAAYAoF7dOoQDAbQHIADA+3cADwIsoBDA/PkADRIIANCewIMA8QMcUGCAAAD8+fXvBzBAAEAAAgcSLGgQwIAFDRQEYDAAAMSIECMQAGDxogAAGjc2KMAAAYQFBQZIIADgJMoBCwIEOEBgAYCYMgkAqGnTpoIAOhUkAODzp08FAQIscBDggYIGAZYyGADgKYAAUgMsMHAAANasWrdqfQDgK9iwYseCbZCgQAAECQgAaOv27VsIAObOFcAgAN68ARoA6OuXgIADAQYHQPAAAGLEAwAwbsx4QIIFAQIgEADgMmYAAhwE6Nz5AYDQAAQQELBAwQAC/xAcBGgdIAKA2LJnNzAA4Dbu2wQA8O7t+zfw3gIcKAhwAAECAMqXM19OAAB06BQGBKhe/UAACgC2c6egIAD4Ag4SQABg3jwDAOrXqx/gwEGAAAcIAKhvv74BBgcWLEgAACAAgQMHChBgIEIAhQEgAHD4EACEAA4cBCiQAEBGjQkAdPT4EWRIkAMYIAiQAIIBACtZtgTwAEBMAAIeHAhwE6cBADt5CkBwIEBQBAIAFDVqYAAApUuVJgjwNIACAFOpTm0Q4ICBAAQAdPX6lQABAAISIAhwdgAAtWvVMgjwdoIAAHPpPgBwF29evXv3Eiiw4ECAABMAFDZ8mAABAIsbOP+AsCBA5AAHAFS2fBkzZgIKAHT23BkCAggLAgQYAAB1atQTAiCAAAB2bNkAKACwDUDAgQC7FQDw/ds3AQAADBxYIABAcgAEGgBw/hx6dOnTATwIcD3AgQEAuHfnTmAAAAAPGiwokCBBAPUCALR3/x6+ewIGANS3f9+AgwMBAjgQABCAwIECEwQwACChwoUDCAxI8MDAgwILFBwIgHEAgI0cNyYQACBkSAIRCgA4iTKlypUsTw44ECBmgAIGANi8aZOAAAMGFAT4ySCCAgQCABgFICBAgQYMCAB4ClWAAABUq1qlKiCAVq0GAHj96vXBAwBky5odIICCgAUB2gZwsAD/QYC5DgYIAIA3LwADBAD4/Qs4sODBhAEYOABgQoAABwA4fgwZgIEKDRoEuKzgQQEAnDlHCABaAQEBAwCYBpCAAIDVrFuvZlAgQAAHCggAuI0bgIAGAHr79k1AAgAACgowGADhgYMAzJknCODggQAA1KtTFwA
gu/bt3Lt7/y6gAQQDAQo4AIA+vXoBCAIEKHAgQIAFCAYAuH8/gP79DQYAAAhAIAADAgAcRJgQwIIADQ8sABBR4kSKEgkQAABAgIAGCgIEeJBgAACSAARAcBBAAQECAFy+BEBgAACaNW3exJkT54ADAXwGWABA6FCiBgIUCJA0gAIBCAA8fQoBQoEA/1UPJCAAQOvWBAC8fgULoEAAsgEUAECbVu1atA8AvAUgIUAABQMMTJAAQO9eAAQcBFAwQAAAwoUBCIAAQPFixo0dP248YEEAypQBXMacoIABAAAGBAAdOgAA0qUTDBCA4EEEAwBcv4YdG3YDBQFsBzAAQPdu3r0FAAAePMDwAAcgOACQXHlyAQMkBGgAYAAA6tUBCACQXft27t29cx8AgYGDAgwYAECf3kEA9gIAGDgQwIGDBREiAMCfX//+/Q8AAAQgcKBAAgwCIEQYAQDDhg4dEgAgcSKABQEuFggAAQDHjhwJGAgQ4ECBAQBOogRAgACAli5fwowp86WDADZtKv8gAGDnTgMAFgQIgIAAAAgOChQAAOABgKZOn0J1KgAA1apWASAIoDXAAQBev4IFOwAA2bJlDUwI4ECAAQBu38IFIMCAAAB279pFAGAv375+/wLuKyAA4QILGCB4AGDxYgMLEBQI4CDBAAEGEDAYAGAz586eOQsAIHo0adIBTgc4AGA169arKQCILXv27AcOHBAAoHs37969IwgAIHw48eLGjxMXICCCgwAIGCAwAGD6dAMArmPPnj1ChAEAvhMAIH48AAIEAKBPr349AAYBAjwAIH8+fQAEAODPr38//gcFAC4AMJBgQYMFBwBQuJBhQ4cPIQIQkOBAAQYKDADQuJGNY0ePHwkAEDmSZEmTAAgESACAZcuWBgYAkDmTZk2bCxgIALCTZ8+eCwAEFTqUaFGjR5EmVbqUqVEIFABElRqVwAIAV7Fm1bp1a4AADSIIICDAgAAAZ9GmVbuWbVu3b+HGlTtXwYEGAgDk1buXb1+/fwEHFjyYcGHDhxEnVryYcWPHjyFHljyZcmXLiQMCADs="
# Decode the base64 payload and write the raw bytes to disk.
with open("heiqie.jpeg", "wb") as fh:
    fh.write(base64.b64decode(image))

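The image value here is the string I got from my own request just now. Replace it with yours, then open the file and you will see the captcha.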
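
The "take everything after the comma" step can be sketched as a small helper. This is only an illustration under stated assumptions: the names decode_data_uri and guess_extension are hypothetical, and the src is assumed to have the form data:<mime>;base64,<payload>. The extension check is included because the sample string above appears to begin with the GIF signature (R0lGODdh decodes to GIF87a), so the bytes may not actually be a JPEG.

```python
import base64


def decode_data_uri(src):
    """Return (mime_type, raw_bytes) for a base64 data URI from an <img> src."""
    # Everything before the first comma is the header, everything after is data.
    header, sep, payload = src.partition(",")
    if not header.startswith("data:") or not sep:
        raise ValueError("not a data URI")
    if not header.endswith(";base64"):
        raise ValueError("data URI is not base64-encoded")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


def guess_extension(raw):
    """Pick a file extension from the leading magic bytes of the image."""
    if raw.startswith((b"GIF87a", b"GIF89a")):
        return ".gif"
    if raw.startswith(b"\xff\xd8\xff"):
        return ".jpg"
    if raw.startswith(b"\x89PNG\r\n\x1a\n"):
        return ".png"
    return ".bin"
```

A caller would write the returned bytes with open("captcha" + guess_extension(raw), "wb"), then pass the same bytes to whatever recognition service is in use.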

bobby
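replying to
HugoL
In the part of your Scrapy code that calls the captcha recognition logic, you can set breakpoints and step through line by line to see which line of the recognition code is going wrong.
2019-01-19
2 replies in total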
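
Alongside breakpoints, a thin wrapper that logs each intermediate value can help narrow down why recognition returns an empty value inside Scrapy but works standalone. This is a rough sketch only: recognize_with_checks is a hypothetical name, and recognizer stands in for whatever function actually calls the recognition service (its real signature is not shown in this thread).

```python
import logging

logger = logging.getLogger("captcha_debug")


def recognize_with_checks(image_bytes, recognizer):
    """Call recognizer(image_bytes), logging and validating each step.

    `recognizer` is a placeholder for the real recognition call.
    """
    logger.debug("captcha payload: %d bytes, head=%r",
                 len(image_bytes), image_bytes[:8])
    if not image_bytes:
        raise ValueError("empty captcha image; check how src was extracted")
    text = recognizer(image_bytes)
    logger.debug("recognizer returned %r", text)
    if not text:
        raise RuntimeError("recognizer returned an empty result")
    return text
```

Placing breakpoint() just before the recognizer call is another option when running the spider from an IDE or a terminal.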

HugoL

Asker

2019-01-14

//img.mukewang.com/szimg/5c3c829500019c1e10010530.jpg
The captcha URL is base64-encoded data, so it cannot be requested.

