Parse PAC files in Python without using C modules


I am in the sticky situation where I cannot use the pacparser library, and I am hoping someone has a pure-Python solution (no C/C++ modules).

I have a PAC file that has multiple proxies returned:

function FindProxyForURL(url,host)
{
if (isPlainHostName(host))
{ return "DIRECT"; }

// Internal network hosts
if (isInNet(host, "158.232.0.0", "255.255.0.0") || isInNet(host, "127.0.0.1", "255.255.255.255")|| isInNet(host, "10.0.0.0", "255.0.0.0"))
{ return "DIRECT"; }

// Connect through proxy server for all other hosts. If proxy server is not available, connect directly
return "PROXY proxy.site.com:3128; PROXY proxy02.site.com:3128; PROXY proxy05.site.com:3128; PROXY proxy03.site.com:3128; PROXY proxy04.site.com:3128";

}

How can I parse this using only Python, and what is the best way to tell which proxy is up?

Thank you and for your consideration to the academy! :)

There is 1 answer

Carson Lam

I've created a pure-Python library called PyPAC which should do what you're looking for. It provides a subclass of requests.Session that honours PACs and includes PAC auto-discovery.

Internally, it uses Js2Py, a Python library that parses and executes JavaScript. PyPAC then has Python implementations of the various JavaScript functions required by the PAC specification.
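
On the second part of the question (telling which proxy is up), here is a minimal standard-library sketch that is independent of PyPAC. It assumes you already have the string returned by FindProxyForURL, splits it into entries, and tries a plain TCP connection to each proxy in order. The helper names parse_pac_result and first_reachable are hypothetical, not part of any library. Note that a successful TCP connect only shows that the port accepts connections; it does not prove the proxy will actually forward your request.

```python
import socket


def parse_pac_result(pac_result):
    """Split a FindProxyForURL return string into (kind, host, port) tuples.

    Hypothetical helper: handles "DIRECT" and "<KIND> host:port" entries
    separated by semicolons, and silently skips anything it cannot parse.
    """
    entries = []
    for part in pac_result.split(";"):
        tokens = part.split()
        if not tokens:
            continue
        kind = tokens[0].upper()
        if kind == "DIRECT":
            entries.append(("DIRECT", None, None))
        elif len(tokens) == 2:
            host, _, port = tokens[1].rpartition(":")
            if host and port.isdigit():
                entries.append((kind, host, int(port)))
    return entries


def first_reachable(entries, timeout=2.0):
    """Return the first entry that is DIRECT or accepts a TCP connection.

    Returns None if no entry is reachable within the timeout.
    """
    for kind, host, port in entries:
        if kind == "DIRECT":
            return (kind, host, port)
        try:
            # Open and immediately close a TCP connection to the proxy port.
            with socket.create_connection((host, port), timeout=timeout):
                return (kind, host, port)
        except OSError:
            # Refused, timed out, or DNS failure: try the next entry.
            continue
    return None


if __name__ == "__main__":
    result = (
        "PROXY proxy.site.com:3128; PROXY proxy02.site.com:3128; "
        "PROXY proxy05.site.com:3128"
    )
    print(first_reachable(parse_pac_result(result)))
```

Trying the entries in the order they are listed mirrors how a PAC result is normally read: the first entry is preferred, and later ones are fallbacks.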