Converting cookies from the cookielib module into a cookie string usable in a request header

Python | admin

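When fetching pages that require cookies, Python 2 code typically relies on three modules: urllib2, urllib, and cookielib. urllib2 requests the page, urllib encodes the POST data, and cookielib handles the cookie information.

Code that combines cookielib with urllib2 usually looks like this: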

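import cookielib
import urllib2
# Keep cookies in a jar and let urllib2 store and resend them automatically
mcj = cookielib.MozillaCookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(mcj))
urllib2.install_opener(opener)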

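With the code above in place, the rest of the program no longer has to deal with cookies; cookielib takes care of them, which is very convenient. Yesterday, however, I ran into a special requirement while writing a program: I needed the actual cookie values obtained after a successful login, not just a logged-in session, because another program adds those post-login cookies to its request header to fetch content that is only available after logging in.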
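For readers on Python 3, the same setup can be sketched as follows. In Python 3, cookielib was renamed to http.cookiejar and urllib2 was folded into urllib.request. No request is sent here, so the jar stays empty; this only illustrates the wiring.

```python
import http.cookiejar
import urllib.request

# Python 3 names: cookielib -> http.cookiejar, urllib2 -> urllib.request
mcj = http.cookiejar.MozillaCookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(mcj))
urllib.request.install_opener(opener)

# From here on, urllib.request.urlopen(...) stores received cookies in mcj
# and sends them back on later requests. Nothing has been fetched yet,
# so the jar holds no cookies.
print(len(mcj))
```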

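From what I found, there are two ways to get the cookie values after a successful login:

  1. Convert the FileCookieJar/MozillaCookieJar/LWPCookieJar object from cookielib to a string, then parse that string to extract the cookies.
  2. The ._cookies.values() of the FileCookieJar/MozillaCookieJar/LWPCookieJar object holds the stored cookies; convert them into a standard cookie string.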

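The process is described in detail below, using the cookies of the 189 mailbox (189.cn) as an example.

I. Method 1

Printing a FileCookieJar/MozillaCookieJar/LWPCookieJar object from cookielib gives output like this: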

<cookielib.FileCookieJar[<Cookie ACCOUNT=13541295162@189.cn for .189.cn/>, <Cookie SESSION_ID=000000155790240-20131211091907124367-020 for .189.cn/>, <Cookie SSONKEY=76add719b0af7a2fc80b95bb436bfb4a0ae869f6171d2177f438366a951d3b9b60ca45e15c71143eea4c7d9f72a1d911f33c466662972fa3d97f83956627e79438911703cc2f9d09badeece1dd73ec606b85e040bb1c0d19753f22f49fbb4761505319fa67c68ca7e590582dda831d648a7d51f669902c7583f83bedf730e9fb2d49dc363122a48485dfa19af45d8f6af076d7fba9922c4dcd6e20cdeb23817ed712e89f318fe1e74128095f6d948e89e3963dbe9d15eeb10bc6112cc029c82535aa3f5c7bac0daf89b40f5270e2151b01c035816ef23428 for .189.cn/>, <Cookie VERIFY_LOGON=7e0763f479ce9f4a98cba921d38659c2 for .189.cn/>, <Cookie SSON=57effa525080063a774c0b063df3844dde36dbf3ae12389fa35a9bc0a8f4af063005f929b94898e8cfd0e7140cf0e2aec6b6c4741705b2f3df979e4387a75ed82e640250917828c0 for .e.189.cn/>, <Cookie JSESSIONID=abcwaGbTt1gd9biA98Glu for open.e.189.cn/>, <Cookie JSESSIONID=aw9_oF8LCWyb8s98Gl for webmail14.189.cn/>]>

So you can convert the object to a string with str() and then extract the cookies with a regular expression.
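A minimal sketch of Method 1 in Python 3 follows. The make_cookie helper, the example.com domain, and the cookie values are placeholders invented for this illustration; the regular expression is one possible pattern, not the one the original program used. It assumes cookie values contain no whitespace and that every cookie has a value.

```python
import re
from http.cookiejar import Cookie, CookieJar


def make_cookie(name, value, domain):
    """Build a session cookie by hand (hypothetical helper for this demo)."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


jar = CookieJar()
jar.set_cookie(make_cookie("ACCOUNT", "user@example.com", ".example.com"))
jar.set_cookie(make_cookie("SESSION_ID", "abc123", ".example.com"))

# str(jar) looks like "<CookieJar[<Cookie NAME=VALUE for DOMAIN/PATH>, ...]>"
text = str(jar)

# Capture NAME and VALUE from each "<Cookie NAME=VALUE for ...>" entry
pairs = re.findall(r"<Cookie ([^=\s]+)=(\S*) for [^>]+>", text)
cookie_header = "; ".join(name + "=" + value for name, value in pairs)
print(cookie_header)
```

Because this depends on the exact text that str() produces, it is more fragile than reading the Cookie objects directly.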

II. Method 2

The ._cookies.values() of a FileCookieJar/MozillaCookieJar/LWPCookieJar object looks like this:

[{'/': {'JSESSIONID': Cookie(version=0, name='JSESSIONID', value='aQsFGM9Ku1sdLbXSFl', port=None, port_specified=False, domain='webmail5.189.cn', domain_specified=False, domain_initial_dot=False, path='/', path_specified=True, secure=False, expires=None, discard=True, comment=None, comment_url=None, rest={}, rfc2109=False)}}, {'/': {'ACCOUNT': Cookie(version=0, name='ACCOUNT', value='13541295162@189.cn', port=None, port_specified=False, domain='.189.cn', domain_specified=True, domain_initial_dot=True, path='/', path_specified=True, secure=False, expires=None, discard=True, comment=None, comment_url=None, rest={}, rfc2109=False), 'VERIFY_LOGON': Cookie(version=0, name='VERIFY_LOGON', value='7e0763f479ce9f4a98cba921d38659c2', port=None, port_specified=False, domain='.189.cn', domain_specified=True, domain_initial_dot=True, path='/', path_specified=True, secure=False, expires=None, discard=True, comment=None, comment_url=None, rest={}, rfc2109=False), 'SSONKEY': Cookie(version=0, name='SSONKEY', value='76add719b0af7a2fc80b95bb436bfb4a0ae869f6171d2177f438366a951d3b9b60ca45e15c71143eea4c7d9f72a1d911f33c466662972fa3d97f83956627e79438911703cc2f9d09badeece1dd73ec606b85e040bb1c0d19753f22f49fbb4761505319fa67c68ca7e590582dda831d648a7d51f669902c7583f83bedf730e9fb2d49dc363122a48485dfa19af45d8f6af076d7fba9922c4dcd6e20cdeb23817ed712e89f318fe1e74128095f6d948e897e5cfae642cdbd9ebc235f33fcdbffd382eb2f31cd3d3117cbda44a57f5f3e4e37af9c88f13ba75c', port=None, port_specified=False, domain='.189.cn', domain_specified=True, domain_initial_dot=True, path='/', path_specified=True, secure=False, expires=None, discard=True, comment=None, comment_url=None, rest={}, rfc2109=False), 'SESSION_ID': Cookie(version=0, name='SESSION_ID', value='000002683296688-20131211032845368117-005', port=None, port_specified=False, domain='.189.cn', domain_specified=True, domain_initial_dot=True, path='/', path_specified=True, secure=False, expires=None, discard=True, comment=None, comment_url=None, rest={}, 
rfc2109=False)}}, {'/': {'SSON': Cookie(version=0, name='SSON', value='57effa525080063a774c0b063df3844dde36dbf3ae12389fa35a9bc0a8f4af06936ad8a9fdb4727c64a34d285b7d0ce1bbf4814917563439df979e4387a75ed82e640250917828c0', port=None, port_specified=False, domain='.e.189.cn', domain_specified=True, domain_initial_dot=True, path='/', path_specified=True, secure=False, expires=None, discard=True, comment=None, comment_url=None, rest={}, rfc2109=False)}}, {'/': {'JSESSIONID': Cookie(version=0, name='JSESSIONID', value='abc81jZBd0Ogx4epXSFlu', port=None, port_specified=False, domain='open.e.189.cn', domain_specified=False, domain_initial_dot=False, path='/', path_specified=True, secure=False, expires=None, discard=True, comment=None, comment_url=None, rest={}, rfc2109=False)}}]

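As shown, this is a list with one item per domain. Each item is a dictionary whose keys are cookie paths and whose values are dictionaries; in those inner dictionaries the keys are cookie names and the values are Cookie objects. The attributes of a Cookie object are as follows (see section "20.21.5. Cookie Objects" of http://docs.python.org/2/library/cookielib.html):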

Cookie.version
Integer or None. Netscape cookies have version 0. RFC 2965 and RFC 2109 cookies have a version cookie-attribute of 1. However, note that cookielib may ‘downgrade’ RFC 2109 cookies to Netscape cookies, in which case version is 0.
Cookie.name
Cookie name (a string).
Cookie.value
Cookie value (a string), or None.
Cookie.port
String representing a port or a set of ports (eg. ‘80’, or ‘80,8080’), or None.
Cookie.path
Cookie path (a string, eg. '/acme/rocket_launchers').
Cookie.secure
True if cookie should only be returned over a secure connection.
Cookie.expires
Integer expiry date in seconds since epoch, or None. See also the is_expired() method.
Cookie.discard
True if this is a session cookie.
Cookie.comment
String comment from the server explaining the function of this cookie, or None.
Cookie.comment_url
URL linking to a comment from the server explaining the function of this cookie, or None.
Cookie.rfc2109
True if this cookie was received as an RFC 2109 cookie (ie. the cookie arrived in a Set-Cookie header, and the value of the Version cookie-attribute in that header was 1). This attribute is provided because cookielib may ‘downgrade’ RFC 2109 cookies to Netscape cookies, in which case version is 0. New in version 2.5.
Cookie.port_specified
True if a port or set of ports was explicitly specified by the server (in the Set-Cookie / Set-Cookie2 header).
Cookie.domain_specified
True if a domain was explicitly specified by the server.
Cookie.domain_initial_dot
True if the domain explicitly specified by the server began with a dot ('.').

Cookies may have additional non-standard cookie-attributes. These may be accessed using the following methods:

Cookie.has_nonstandard_attr(name)
Return true if cookie has the named cookie-attribute.
Cookie.get_nonstandard_attr(name, default=None)
If cookie has the named cookie-attribute, return its value. Otherwise, return default.
Cookie.set_nonstandard_attr(name, value)
Set the value of the named cookie-attribute.

The Cookie class also defines the following method:

Cookie.is_expired([now=None])
True if cookie has passed the time at which the server requested it should expire. If now is given (in seconds since the epoch), return whether the cookie has expired at the specified time.
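The attributes and methods listed above can be tried out on a hand-built Cookie object. The sketch below uses the Python 3 module http.cookiejar, which provides the same Cookie class; the cookie name, value, domain, and the non-standard attributes are made up for illustration.

```python
import time
from http.cookiejar import Cookie

# All values here are placeholders, not real cookies
cookie = Cookie(
    version=0, name="SESSION_ID", value="abc123",
    port=None, port_specified=False,
    domain=".example.com", domain_specified=True, domain_initial_dot=True,
    path="/", path_specified=True,
    secure=False, expires=int(time.time()) + 3600, discard=False,
    comment=None, comment_url=None,
    rest={"HttpOnly": None},  # non-standard cookie-attributes go in `rest`
)

# Plain attributes
print(cookie.name, cookie.value, cookie.domain, cookie.path)

# Expiry check: against the current time, then against a time past expiry
print(cookie.is_expired())
print(cookie.is_expired(now=cookie.expires + 1))

# Non-standard attributes
print(cookie.has_nonstandard_attr("HttpOnly"))
cookie.set_nonstandard_attr("SameSite", "Lax")
print(cookie.get_nonstandard_attr("SameSite"))
print(cookie.get_nonstandard_attr("Missing", default="n/a"))
```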

 

So the detailed code for converting the cookies in a CookieJar object into a cookie string that can be added to a request header is as follows:

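cookie_str = ""
for content in mcj._cookies.values():        # one dict per domain
    for path, value in content.items():      # path -> {name: Cookie}
        for name, cookie in value.items():
            # Skip cookies whose domain contains 'e.189.cn'
            if cookie.domain.find('e.189.cn') == -1:
                cookie_str = cookie_str + cookie.name + "=" + cookie.value + "; "
# Drop the trailing "; "
cookie_str = cookie_str[:-2]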

After this processing, the cookies in the CookieJar object become a cookie string that can be used in a request, as follows:

JSESSIONID=aQsFGM9Ku1sdLbXSFl; ACCOUNT=13541295162@189.cn; VERIFY_LOGON=7e0763f479ce9f4a98cba921d38659c2; SSONKEY=76add719b0af7a2fc80b95bb436bfb4a0ae869f6171d2177f438366a951d3b9b60ca45e15c71143eea4c7d9f72a1d911f33c466662972fa3d97f83956627e79438911703cc2f9d09badeece1dd73ec606b85e040bb1c0d19753f22f49fbb4761505319fa67c68ca7e590582dda831d648a7d51f669902c7583f83bedf730e9fb2d49dc363122a48485dfa19af45d8f6af076d7fba9922c4dcd6e20cdeb23817ed712e89f318fe1e74128095f6d948e897e5cfae642cdbd9ebc235f33fcdbffd382eb2f31cd3d3117cbda44a57f5f3e4e37af9c88f13ba75c; SESSION_ID=000002683296688-20131211032845368117-005
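A string like this can then be attached to a request as the Cookie header, which is what the second program mentioned earlier needs. The sketch below uses Python 3's urllib.request; the URL, the User-Agent value, and the cookie string are placeholders, and the request is only constructed, not sent.

```python
import urllib.request

# Placeholder cookie string in the same "name=value; name=value" form
cookie_str = "JSESSIONID=abc123; ACCOUNT=user@example.com"

# Hypothetical URL; replace with the page that requires the login cookies
req = urllib.request.Request(
    "http://example.com/inbox",
    headers={"Cookie": cookie_str, "User-Agent": "Mozilla/5.0"},
)

# Confirm the header is attached before sending
print(req.get_header("Cookie"))

# To actually send the request:
# html = urllib.request.urlopen(req).read()
```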

 

———————————————————————————————————————————————–

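While looking things up, I happened to come across the article 《Python利用CookieJar自动处理Cookies》 (Using CookieJar in Python to handle cookies automatically). Its author's Python code is much simpler than mine, which I really admire. The code is as follows: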

cookie = ''
for c in mcj:
    cookie += c.name + '=' + c.value + '; '
# Print the merged cookie string and compare it with the output above to check that the cookies really were merged
print(cookie)

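In other words, a CookieJar can be iterated over directly and yields the Cookie objects it contains, so simply looping over it and reading each Cookie object's attributes is enough to convert a CookieJar into a cookie string.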

Please credit the source when reposting: jinglingshu的博客 » Converting cookies from the cookielib module into a cookie string usable in a request header



Recent comments (2)

  1. For the cookielib module, see http://my.oschina.net/duhaizhang/blog/69342
    admin (2013-12-11)
  2. Helpful.
    小小菜鸟 (2015-05-27)