How to write a proxy pool server (when a request comes in, choose a proxy to fetch the URL content) in Python?
I don't know the proper name for this kind of proxy server, so you're welcome to fix the question title.
When I search for a proxy server on Google, I find a lot of implementations such as maproxy or a-python-proxy-in-less-than-100-lines-of-code. Those proxy servers seem to simply ask the remote server for the URL.
I want to build a proxy server that contains a proxy pool (a list of HTTP/HTTPS proxies) and has only one IP address and one port to serve incoming requests. When a request comes in, it chooses a proxy from the pool, makes the request through it, and returns the result.
For example, I have a VPS whose IP is '192.168.1.66'. I start the proxy server on the VPS at IP '127.0.0.1' and port '8080'.
I can then use this proxy as shown below.
    import requests

    url = 'http://www.google.com'
    headers = { ... }
    proxies = { 'http': 'http://192.168.1.66:8080' }
    r = requests.get(url, headers=headers, proxies=proxies)
I have seen an implementation like this:
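    from twisted.web import proxy, http
    from twisted.internet import reactor
    from twisted.python import log
    import sys

    # Send Twisted's log output to stdout
    log.startLogging(sys.stdout)

    class ProxyFactory(http.HTTPFactory):
        # Each incoming connection is handled by Twisted's built-in HTTP proxy protocol
        protocol = proxy.Proxy

    reactor.listenTCP(8080, ProxyFactory())
    reactor.run()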
It works and it is simple, but I have no idea how it works or how to improve this code to use a proxy pool.
An example flow, from hidu/proxy-manager, which is written in Go:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+  client (want visit http://www.baidu.com/)             +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
                        |
                        |  via proxy 127.0.0.1:8090
                        |
                        v
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+                       +         proxy pool             +
+ proxy manager listen  ++++++++++++++++++++++++++++++++++
+ on (127.0.0.1:8090)   +  http_proxy1,http_proxy2,      +
+                       +  socks5_proxy1,socks5_proxy2   +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
                        |
                        |  choose 1 proxy visit
                        |  www.baidu.com
                        |
                        v
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+  site:www.baidu.com                                    +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Your proxy pool concept is not hard to implement. If I understand correctly, you want to build the following.
- Your proxy server listens for requests on 192.168.1.66:8080
- A client requests access to http://www.google.com
- Your proxy server forwards the client's request to another proxy server, chosen from a list of proxy servers (the proxy pool).
- Your proxy server gets the response from that proxy server and returns it to the client
So, I've written a simple proxy server using Flask and Requests.
    from flask import Flask, Response, stream_with_context
    import random
    import requests

    app = Flask(__name__)

    @app.route('/p/<path:url>')
    def proxy(url):
        """ request /p/www.google.com """
        # Forward the scheme-less target (e.g. www.google.com) to an upstream proxy
        r = get_response(url)
        # Stream the upstream body back to the client as it arrives
        return Response(stream_with_context(r.iter_content()),
                        content_type=r.headers['content-type'])

    def get_proxy():
        # "proxy pool"
        proxies = [
            'http://proxy-server-1.com',
            'http://proxy-server-2.com',
            'http://proxy-server-3.com',
        ]
        return random.choice(proxies)

    def get_response(target_url):
        proxy = get_proxy()
        # Assumes each upstream proxy exposes the same /p/<url> route,
        # e.g. this generates http://proxy-server-1.com/p/www.google.com
        url = "{}/p/{}".format(proxy, target_url)
        return requests.get(url, stream=True)

    if __name__ == '__main__':
        app.run()
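The example above assumes every server in the pool exposes the same /p/ route. If the pool instead holds ordinary HTTP proxies, the forwarding step could route the request through the chosen proxy directly. The following is only a rough sketch of that idea using the standard library's urllib.request so it has no extra dependencies; the pool addresses and the helper names build_opener_for and fetch_via_pool are made up for illustration.

```python
import random
import urllib.request

# Hypothetical pool entries; replace with the addresses of real HTTP proxies.
PROXY_POOL = [
    'http://proxy-server-1.com:8080',
    'http://proxy-server-2.com:8080',
    'http://proxy-server-3.com:8080',
]


def build_opener_for(proxy_url):
    """Return an opener that sends http and https requests through proxy_url."""
    handler = urllib.request.ProxyHandler({'http': proxy_url, 'https': proxy_url})
    return urllib.request.build_opener(handler)


def fetch_via_pool(target_url, pool=PROXY_POOL, timeout=10):
    """Pick a random proxy from the pool and fetch target_url through it."""
    proxy_url = random.choice(pool)
    opener = build_opener_for(proxy_url)
    with opener.open(target_url, timeout=timeout) as resp:
        return proxy_url, resp.status, resp.read()
```

Calling fetch_via_pool('http://www.google.com') would return the proxy that was used together with the status code and body, which a Flask view like the one above could then pass back to the client.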
From here, you can start improving your proxy server.
A common proxy pool, or proxy manager, can check the availability, speed, and other stats of its proxies, and select the best proxy to send the request through. And of course, this example handles only simple requests; you can add features to handle request args, methods, and protocols.
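As a rough illustration of what such a proxy manager might track, here is a minimal sketch. The class names ProxyStats and ProxyPool, the max_failures threshold, and the "lowest average response time wins" rule are all assumptions made for this example, not part of any existing library.

```python
import random
from dataclasses import dataclass


@dataclass
class ProxyStats:
    """Running statistics for one upstream proxy (hypothetical structure)."""
    successes: int = 0
    consecutive_failures: int = 0
    total_seconds: float = 0.0

    @property
    def average_seconds(self):
        # Untested proxies report 0.0 so that they get tried at least once.
        return self.total_seconds / self.successes if self.successes else 0.0


class ProxyPool:
    def __init__(self, proxies, max_failures=3):
        self.stats = {p: ProxyStats() for p in proxies}
        self.max_failures = max_failures

    def record(self, proxy, ok, seconds=0.0):
        """Store the outcome of one request made through `proxy`."""
        s = self.stats[proxy]
        if ok:
            s.successes += 1
            s.total_seconds += seconds
            s.consecutive_failures = 0
        else:
            s.consecutive_failures += 1

    def available(self):
        """Proxies that have not failed max_failures times in a row."""
        return [p for p, s in self.stats.items()
                if s.consecutive_failures < self.max_failures]

    def choose(self):
        """Pick the available proxy with the lowest average response time."""
        candidates = self.available()
        if not candidates:
            raise RuntimeError('no proxy available')
        best = min(self.stats[p].average_seconds for p in candidates)
        fastest = [p for p in candidates if self.stats[p].average_seconds == best]
        return random.choice(fastest)  # break ties randomly
```

In the Flask example, get_proxy() could call pool.choose(), and get_response() could time each request and report the outcome with pool.record().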
Hope this is helpful!