Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
214 views
in Technique[技术] by (71.8m points)

python - Tell urllib2 to use custom DNS

I'd like to tell urllib2.urlopen (or a custom opener) to use 127.0.0.1 (or ::1) to resolve addresses. I wouldn't change my /etc/resolv.conf, however.

One possible solution is to use a tool like dnspython to query addresses and httplib to build a custom url opener. I'd prefer telling urlopen to use a custom nameserver though. Any suggestions?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Looks like name resolution is ultimately handled by socket.create_connection.

-> urllib2.urlopen
-> httplib.HTTPConnection
-> socket.create_connection

Though once the "Host:" header has been set, you can resolve the host and pass on the IP address through down to the opener.

I'd suggest that you subclass httplib.HTTPConnection, and wrap the connect method to modify self.host before passing it to socket.create_connection.

Then subclass HTTPHandler (and HTTPSHandler) to replace the http_open method with one that passes your HTTPConnection instead of httplib's own to do_open.

Like this:

import urllib2
import httplib
import socket

def MyResolver(host):
  if host == 'news.bbc.co.uk':
    return '66.102.9.104' # Google IP
  else:
    return host

class MyHTTPConnection(httplib.HTTPConnection):
  def connect(self):
    self.sock = socket.create_connection((MyResolver(self.host),self.port),self.timeout)
class MyHTTPSConnection(httplib.HTTPSConnection):
  def connect(self):
    sock = socket.create_connection((MyResolver(self.host), self.port), self.timeout)
    self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file)

class MyHTTPHandler(urllib2.HTTPHandler):
  def http_open(self,req):
    return self.do_open(MyHTTPConnection,req)

class MyHTTPSHandler(urllib2.HTTPSHandler):
  def https_open(self,req):
    return self.do_open(MyHTTPSConnection,req)

opener = urllib2.build_opener(MyHTTPHandler,MyHTTPSHandler)
urllib2.install_opener(opener)

f = urllib2.urlopen('http://news.bbc.co.uk')
data = f.read()
from lxml import etree
doc = etree.HTML(data)

>>> print doc.xpath('//title/text()')
['Google']

Obviously there are certificate issues if you use the HTTPS, and you'll need to fill out MyResolver...


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...