Posted on 2/13/2015 10:44:10 PM
0×00 Background
What are the red envelopes like? His brother's son, Hu'er, said, "Money is almost comparable." His brother's daughter, Daoyun, said, "It's not as good as auntie rising with the wind." Everyone knows the background: it's the New Year, the season when red envelopes fly all over the sky. I happened to start learning Python two days ago and was rather excited, so I dug into grabbing Weibo red envelopes. Why Weibo red envelopes instead of Alipay ones? Because the Web is all I understand. If I have the energy, I may also study the whack-a-mole algorithm in the future.
I am a Python beginner, and this is only the third program I have written since I started learning, so if there are pitfalls in the code, please don't call them out to my face. The focus is on the idea. And if there are pitfalls in the idea too, please don't call those out to my face either. If IE has the nerve to set itself as the default browser, then my writing a rubbish article is acceptable too... I use Python 2.7, and I hear there is a big difference between Python 2 and Python 3.
0×01 Ideas
I was too lazy to describe the idea in words, so I drew a sketch; everyone should be able to understand it.
First, as usual, import a pile of libraries whose purpose you may not know but which you can't do without:
[mw_shl_code=java,true]
import re
import urllib
import urllib2
import cookielib
import base64
import binascii
import os
import json
import sys
import cPickle as p
import rsa
[/mw_shl_code]
Then declare some other variables that will be needed later:
[mw_shl_code=java,true]
reload(sys)
sys.setdefaultencoding('utf-8')  # set the default string encoding to utf-8
luckyList = []  # list of red envelopes
lowest = 10  # lowest prize record we are willing to tolerate for a red envelope
[/mw_shl_code]
The rsa library used here is not included with Python by default and needs to be installed: https://pypi.python.org/pypi/rsa/
After downloading it, run python setup.py install, and then we can start the development steps.
0×02 Weibo login
Grabbing red envelopes has to happen after logging in, so a login function is required. Logging in is not the crux; keeping the cookies is, and that needs the help of cookielib.
[mw_shl_code=java,true]
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
urllib2.install_opener(opener)
[/mw_shl_code]
With this in place, every network operation that goes through opener handles cookie state automatically. I don't understand it very well, but it feels magical. Next we wrap two helpers: one that simply GETs data, and one that POSTs data.
[mw_shl_code=java,true]
def getData(url):
    try:
        req = urllib2.Request(url)
        result = opener.open(req)
        text = result.read()
        # re-encode to gbk so console output is not garbled on Windows
        text = text.decode("utf-8").encode("gbk", 'ignore')
        return text
    except Exception, e:
        print u'Request exception, url: ' + url
        print e

def postData(url, data, header):
    try:
        data = urllib.urlencode(data)
        req = urllib2.Request(url, data, header)
        result = opener.open(req)
        text = result.read()
        return text
    except Exception, e:
        print u'Request exception, url: ' + url
[/mw_shl_code]
With these two helpers we can GET and POST data. getData decodes and then re-encodes because my debug output under Win7 was always garbled, so I added some encoding handling. None of this is the point, though; the core of the Weibo login is the login function.
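As an aside before the login function, here is a minimal sketch of the same cookie-aware opener in Python 3 (the article itself uses Python 2.7, where the modules are cookielib and urllib2). The helper names, the SUB cookie name, and its value are made up for illustration, and nothing is sent over the network; the sketch only shows that a jar attached to an opener decides which Cookie header a request to a matching domain would carry.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return (jar, opener); every request made through opener shares jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return jar, opener


def make_cookie(name, value, domain):
    """Build a session cookie by hand (normally the server sets these)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


jar, opener = build_cookie_opener()
jar.set_cookie(make_cookie("SUB", "demo-value", ".weibo.com"))  # hypothetical cookie

# Ask the jar which Cookie header it would attach to a weibo.com request.
req = urllib.request.Request("http://weibo.com/")
jar.add_cookie_header(req)
cookie_header = req.get_header("Cookie")
```

In real use the jar is filled by the server's Set-Cookie responses during login, which is why every later request has to go through the same opener.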
[mw_shl_code=java,true]
def login(nick, pwd):
    print u"----------login----------"
    print "----------......----------"
    # step 1: prelogin returns the server time, nonce and RSA public key
    prelogin_url = 'http://login.sina.com.cn/sso/prelogin.php?entry=weibo&callback=sinaSSOController.preloginCallBack&su=%s&rsakt=mod&checkpin=1&client=ssologin.js(v1.4.15)&_=1400822309846' % nick
    preLogin = getData(prelogin_url)
    servertime = re.findall('"servertime":(.+?),', preLogin)[0]
    pubkey = re.findall('"pubkey":"(.+?)",', preLogin)[0]
    rsakv = re.findall('"rsakv":"(.+?)",', preLogin)[0]
    nonce = re.findall('"nonce":"(.+?)",', preLogin)[0]
    #print bytearray('xxxx','utf-8')
    # step 2: encode the user name and RSA-encrypt the password
    su = base64.b64encode(urllib.quote(nick))
    rsaPublickey = int(pubkey, 16)
    key = rsa.PublicKey(rsaPublickey, 65537)
    message = str(servertime) + '\t' + str(nonce) + '\n' + str(pwd)
    sp = binascii.b2a_hex(rsa.encrypt(message, key))
    header = {'User-Agent': 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)'}
    param = {
        'entry': 'weibo',
        'gateway': '1',
        'from': '',
        'savestate': '7',
        'userticket': '1',
        'ssosimplelogin': '1',
        'vsnf': '1',
        'vsnval': '',
        'su': su,
        'service': 'miniblog',
        'servertime': servertime,
        'nonce': nonce,
        'pwencode': 'rsa2',
        'sp': sp,
        'encoding': 'UTF-8',
        'url': 'http://weibo.com/ajaxlogin.php?framelogin=1&callback=parent.sinaSSOController.feedBackUrlCallBack',
        'returntype': 'META',
        'rsakv': rsakv,
    }
    # step 3: submit to the Sina login interface
    s = postData('http://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.15)', param, header)
    try:
        # step 4: request the returned Weibo address so the login takes effect
        urll = re.findall("location.replace\(\'(.+?)\'\);", s)[0]
        login = getData(urll)
        print u"---------Login successful!-------"
        print "----------......----------"
    except Exception, e:
        print u"---------Login failed!-------"
        print "----------......----------"
        exit(0)
[/mw_shl_code]
The parameters and the encryption algorithm here are copied from the Internet, and I don't understand them very well. Roughly: first request a timestamp and a public key, then RSA-encrypt the password, and finally submit the processed data to the Sina login interface. After a successful login, Sina returns a Weibo address, which must be requested so that the login state fully takes effect. After that, subsequent requests carry the current user's cookie.
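To make the preparation steps of the login function above easier to follow, here is a small Python 3 sketch (the article uses Python 2.7) of everything that happens before the RSA call: reading the prelogin response, encoding the user name into su, and assembling the plaintext that gets encrypted into sp. The helper names and the sample response are invented for illustration; in particular, a real pubkey is a long hexadecimal string. Parsing the JSONP payload with json in place of the article's regular expressions is a substitution on my part, and the RSA step is left out because it needs the third-party rsa package.

```python
import base64
import json
import re
from urllib.parse import quote


def parse_prelogin(jsonp_text):
    """Strip the JSONP callback wrapper and decode the JSON payload."""
    match = re.search(r"\((\{.*\})\)", jsonp_text, re.S)
    if match is None:
        raise ValueError("no JSON object found in prelogin response")
    return json.loads(match.group(1))


def encode_username(nick):
    """su = base64 of the URL-quoted user name, as in the login function."""
    return base64.b64encode(quote(nick).encode("ascii")).decode("ascii")


def build_plaintext(servertime, nonce, pwd):
    """The string that is RSA-encrypted to produce the 'sp' parameter."""
    return "%s\t%s\n%s" % (servertime, nonce, pwd)


# Made-up sample response; field names follow the article's regexes.
sample = (
    'sinaSSOController.preloginCallBack({"retcode":0,"servertime":1423838650,'
    '"pubkey":"EB2A38","rsakv":"1330428213","nonce":"ABC123"})'
)
info = parse_prelogin(sample)
su = encode_username("user@example.com")
plaintext = build_plaintext(info["servertime"], info["nonce"], "secret")
```

Mixing the server time and nonce into the plaintext means the encrypted password differs on every login attempt, so a captured sp value cannot simply be replayed.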
0×03 Drawing a designated red envelope
After logging in to Weibo successfully, I couldn't wait to find a red envelope and try it, in the browser first of course. Eventually I found a page with a red envelope button and pressed F12 to summon the debugger and see what the request packet looked like.
You can see that the request goes to http://huodong.weibo.com/aj_hongbao/getlucky. There are two main parameters: ouid, the red envelope id, which can be seen in the URL, and share, which decides whether to share the result to Weibo. There is also a _t whose purpose I don't know. In theory, submitting these three parameters to this URL completes the draw, but when you actually submit them, the server magically returns a string like this:
[mw_shl_code=java,true]
{"code":303403,"msg":"Sorry, you don't have permission to access this page","data":[]}
[/mw_shl_code]
Don't panic. From my many years of web development experience, the other side's programmer is probably checking the referer. The fix is simple: copy all the headers from the original request.
[mw_shl_code=java,true]
def getLucky(id):  # draw routine
    print u"---draw red envelope from:" + str(id) + "---"
    print "----------......----------"
    if checkValue(id) == False:  # does not meet the criteria; checkValue is covered later
        return
    luckyUrl = "http://huodong.weibo.com/aj_hongbao/getlucky"
    param = {
        'ouid': id,
        'share': 0,
        '_t': 0
    }
    # headers copied from the browser request, including the Referer
    header = {
        'Cache-Control': 'no-cache',
        'Content-Type': 'application/x-www-form-urlencoded',
        'Origin': 'http://huodong.weibo.com',
        'Pragma': 'no-cache',
        'Referer': 'http://huodong.weibo.com/hongbao/' + str(id),
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.146 BIDUBrowser/6.x Safari/537.36',
        'X-Requested-With': 'XMLHttpRequest'
    }
    res = postData(luckyUrl, param, header)
[/mw_shl_code]
In theory this now works, and in practice it does too.
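The referer fix can be checked without touching the network. The following is a minimal Python 3 sketch (the article uses Python 2.7) that builds the same POST request and lets you inspect it before sending. build_lucky_request is a name invented here, the id 123456 is a placeholder, and only a subset of the copied headers is shown.

```python
from urllib.parse import urlencode
from urllib.request import Request

LUCKY_URL = "http://huodong.weibo.com/aj_hongbao/getlucky"


def build_lucky_request(ouid):
    """Build (but do not send) the POST request for one red envelope id."""
    params = {"ouid": ouid, "share": 0, "_t": 0}
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Origin": "http://huodong.weibo.com",
        # The Referer must point at the envelope's own page.
        "Referer": "http://huodong.weibo.com/hongbao/" + str(ouid),
        "X-Requested-With": "XMLHttpRequest",
    }
    body = urlencode(params).encode("ascii")
    # Supplying data makes urllib issue a POST, not a GET.
    return Request(LUCKY_URL, data=body, headers=headers)


req = build_lucky_request(123456)  # placeholder id
```

Building the request separately from sending it makes it easy to compare the outgoing headers against what the browser's debugger shows.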
After the draw request completes, we need to check the status. The returned res is a JSON string: a code of 100000 means success, 90114 means today's draw limit has been reached, and any other value is also a failure, so:
[mw_shl_code=java,true]
    # continues inside getLucky
    hbRes = json.loads(res)
    # str() so the check works whether code arrives as a number or a string
    if str(hbRes["code"]) == '901114':  # today's red envelopes are used up
        print u"---------Upper limit reached---------"
        print "----------......----------"
        log('lucky', str(id) + '---' + str(hbRes["code"]) + '---' + hbRes["data"]["title"])
        exit(0)
    elif str(hbRes["code"]) == '100000':  # success
        print u"---------Wishing you prosperity---------"
        print "----------......----------"
        log('success', str(id) + '---' + res)
        exit(0)

    if hbRes["data"] and hbRes["data"]["title"]:
        print hbRes["data"]["title"]
        print "----------......----------"
        log('lucky', str(id) + '---' + str(hbRes["code"]) + '---' + hbRes["data"]["title"])
    else:
        print u"---------Request error---------"
        print "----------......----------"
        log('lucky', str(id) + '---' + res)
[/mw_shl_code]
Here log is another function of my own, used to record logs:
[mw_shl_code=java,true]
def log(type, text):
    # append one line to <type>.txt
    fp = open(type + '.txt', 'a')
    fp.write(text)
    fp.write('\r\n')
    fp.close()
[/mw_shl_code]
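The status check can also be separated from the printing and logging, which makes it testable on its own. This is a hedged Python 3 sketch (the article uses Python 2.7); classify and the status labels are names invented here, and the limit code is the one written in the article's code listing. Codes are compared after str() because the permission-error sample earlier shows code as a bare number.

```python
import json

SUCCESS = "100000"        # success code named in the article
LIMIT_REACHED = "901114"  # daily-limit code as written in the article's code


def classify(res):
    """Map a raw JSON response string to a (status, title) pair."""
    try:
        body = json.loads(res)
    except ValueError:
        return "error", None
    if not isinstance(body, dict):
        return "error", None
    code = str(body.get("code"))  # tolerate both numeric and string codes
    data = body.get("data")
    # data may be an empty list on failure, so only read title from a dict
    title = data.get("title") if isinstance(data, dict) else None
    if code == SUCCESS:
        return "success", title
    if code == LIMIT_REACHED:
        return "limit", title
    return "failed", title


denied = classify('{"code":303403,"msg":"no permission","data":[]}')
won = classify('{"code":"100000","data":{"title":"ok"}}')
```

The caller can then decide what to print, what to log, and whether to exit, based only on the returned status.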