Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
197 views
in Technique[技术] by (71.8m points)

python - Convert HTTP Proxy to HTTPS Proxy in Twisted

Recently I have been playing around with the HTTP Proxy in twisted. After much trial and error I think I finally I have something working. What I want to know though, is how, if it is possible, do I expand this proxy to also be able to handle HTTPS pages? Here is what I've got so far:

from twisted.internet import reactor
from twisted.web import http
from twisted.web.proxy import Proxy, ProxyRequest, ProxyClientFactory, ProxyClient



class HTTPProxyClient(ProxyClient):
    def handleHeader(self, key, value):
        print "%s : %s" % (key, value)
        ProxyClient.handleHeader(self, key, value)

    def handleResponsePart(self, buffer):
        print buffer
        ProxyClient.handleResponsePart(self, buffer)

class HTTPProxyFactory(ProxyClientFactory):
    protocol = HTTPProxyClient

class HTTPProxyRequest(ProxyRequest):
    protocols = {'http' : HTTPProxyFactory}

    def process(self):
        print self.method
        for k,v in self.requestHeaders.getAllRawHeaders():
            print "%s : %s" % (k,v)
        print "
 
"

        ProxyRequest.process(self)

class HTTPProxy(Proxy):

    requestFactory = HTTPProxyRequest


factory = http.HTTPFactory()
factory.protocol = HTTPProxy

reactor.listenSSL(8001, factory)
reactor.run()

As this code demonstrates, for the sake of example for now I am just printing out whatever is going through the connection. Is it possible to handle HTTPS with the same classes? If not, how should I go about implementing such a thing?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

If you want to connect to an HTTPS website via an HTTP proxy, you need to use the CONNECT HTTP verb (because that's how a proxy works for HTTPS). In this case, the proxy server simply connects to the target server and relays whatever is sent by the server back to the client's socket (and vice versa). There's no caching involved in this case (but you might be able to log the hosts you're connecting to).

The exchange will look like this (client to proxy):

C->P: CONNECT target.host:443 HTTP/1.0
C->P:

P->C: 200 OK
P->C: 

After this, the proxy simply opens a plain socket to the target server (no HTTP or SSL/TLS yet) and relays everything between the initial client and the target server (including the TLS handshake that the client initiates). The client upgrades the existing socket it has to the proxy to use TLS/SSL (by starting the SSL/TLS handshake). Once the client has read the '200' status line, as far as the client is concerned, it's as if it had made the connection to the target server directly.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...