Mailing List Archive

Trouble with proxies
Hello again everyone,

I have tried all the suggestions people have sent me, and I have tried all
the local debugging I could think of, but I still can't see the world from
behind my proxy server. Can anyone find a possible solution to this? I've
had to modify my URL lines with (colin-slash-slash) to get past DejaNews'
Draconian posting filters:


C:\>SET http_proxy=http(colin-slash-slash)10.187.200.230

C:\>"C:\Program Files\Python\python.exe"
Python 1.5.2 (#0, Apr 13 1999, 10:51:12) [MSC 32 bit (Intel)] on win32
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> import urllib
>>> u=urllib.urlopen('http(colin-slash-slash)www.yahoo.com')
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "C:\Program Files\Python\Lib\urllib.py", line 59, in urlopen
    return _urlopener.open(url)
  File "C:\Program Files\Python\Lib\urllib.py", line 157, in open
    return getattr(self, name)(url)
  File "C:\Program Files\Python\Lib\urllib.py", line 266, in open_http
    errcode, errmsg, headers = h.getreply()
  File "C:\Program Files\Python\Lib\httplib.py", line 121, in getreply
    line = self.file.readline()
  File "C:\Program Files\Python\Lib\plat-win\socket.py", line 117, in readline
    new = self._sock.recv(self._rbufsize)
IOError: [Errno socket error] (10054, 'winsock error')
>>>


I'm just out of ideas on how to solve this one. Thanks for any pointers,
- Bruce

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own
Trouble with proxies [ In reply to ]
befletch@my-dejanews.com writes:

> I have tried all the suggestions people have sent me, and I have tried all
> the local debugging I could think of, but I still can't see the world from
> behind my proxy server. Can anyone find a possible solution to this? I've
> had to modify my URL lines with (colin-slash-slash) to get past DejaNews'
> Draconian posting filters:
>
> C:\>SET http_proxy=http(colin-slash-slash)10.187.200.230
>
> C:\>"C:\Program Files\Python\python.exe"
> Python 1.5.2 (#0, Apr 13 1999, 10:51:12) [MSC 32 bit (Intel)] on win32
> Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
> >>> import urllib
> >>> u=urllib.urlopen('http(colin-slash-slash)www.yahoo.com')
> Traceback (innermost last):
>   File "<stdin>", line 1, in ?
>   File "C:\Program Files\Python\Lib\urllib.py", line 59, in urlopen
>     return _urlopener.open(url)
>   File "C:\Program Files\Python\Lib\urllib.py", line 157, in open
>     return getattr(self, name)(url)
>   File "C:\Program Files\Python\Lib\urllib.py", line 266, in open_http
>     errcode, errmsg, headers = h.getreply()
>   File "C:\Program Files\Python\Lib\httplib.py", line 121, in getreply
>     line = self.file.readline()
>   File "C:\Program Files\Python\Lib\plat-win\socket.py", line 117, in readline
>     new = self._sock.recv(self._rbufsize)
> IOError: [Errno socket error] (10054, 'winsock error')
> >>>

A quick lookup in errno.errorcode shows that that error is
WSAECONNRESET, in other words the connection is reset by the server.
This apparently happens after the proxy has read your headers. Could
it be that the proxy server requires some kind of magic header? Ask
the sysadmin who is responsible for the proxy. At least find out what
the proxy software is, you can probably find the specs on the web....

If you have a way to snoop network packets, it would be interesting to
see what traffic happens when your regular browser (IE or netscape)
connects to the proxy from the same client machine (I'm assuming that
works!).
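
For readers following along on a current Python, the lookup described above can be sketched as below. This is only an illustration: the helper name error_name and the small fallback table are assumptions added here, because errno.errorcode lists the Winsock codes only on Windows.

```python
import errno

# Fallback names for a few Winsock codes, used only when this platform's
# errno table does not list them (for example, when not running on Windows).
# This table is an addition for the sketch, not part of the errno module.
WINSOCK_FALLBACK = {
    10053: "WSAECONNABORTED",
    10054: "WSAECONNRESET",
    10060: "WSAETIMEDOUT",
    10061: "WSAECONNREFUSED",
}


def error_name(code):
    """Return the symbolic name for a numeric socket error code."""
    name = errno.errorcode.get(code)
    if name is None:
        name = WINSOCK_FALLBACK.get(code, "unknown")
    return name


print(error_name(10054))
```

The name only says that the peer reset the connection; it does not say why, which is what the rest of the thread tries to find out.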

--Guido van Rossum (home page: http://www.python.org/~guido/)
Trouble with proxies [ In reply to ]
On Fri, 30 Apr 1999 befletch@my-dejanews.com wrote:

> Hello again everyone,
>
> I have tried all the suggestions people have sent me, and I have tried all
> the local debugging I could think of, but I still can't see the world from
> behind my proxy server. Can anyone find a possible solution to this? I've
> had to modify my URL lines with (colin-slash-slash) to get past DejaNews'
> Draconian posting filters:
>
>
> C:\>SET http_proxy=http(colin-slash-slash)10.187.200.230

I found I had to set the port of the proxy server (in our case 8080), so:

C:\>SET http_proxy=http://10.187.200.230:8080

or whatever

[FYI, it's "colon", not "colin"]
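
A rough sketch of the same idea in a current Python 3, where the proxy can also be given explicitly instead of through the environment. The helper name make_proxy_opener is hypothetical, and insisting on an explicit port is a choice made for this sketch rather than something urllib requires. Nothing here opens a network connection.

```python
import urllib.request
from urllib.parse import urlsplit


def make_proxy_opener(proxy_url):
    """Build an opener that routes http requests through proxy_url.

    Rejects values that are not of the form http://host:port, so a
    mistyped proxy string fails early with a readable message.
    """
    parts = urlsplit(proxy_url)
    if parts.scheme != "http" or not parts.hostname:
        raise ValueError("expected http://host:port, got %r" % proxy_url)
    if parts.port is None:
        raise ValueError("no port given in %r" % proxy_url)
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler), handler


# The address is the one from the post; 8080 is just the port that
# happened to apply at the poster's site.
opener, handler = make_proxy_opener("http://10.187.200.230:8080")
print(handler.proxies)
```

Calling opener.open(url) would then send the request via the proxy; that step is left out because it needs a reachable proxy.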

--david ascher
Trouble with proxies [ In reply to ]
In article <5lzp3qafkj.fsf@eric.cnri.reston.va.us>,
Guido van Rossum <guido@eric.cnri.reston.va.us> wrote:
> A quick lookup in errno.errorcode shows that that error is
> WSAECONNRESET, in other words the connection is reset by the server.
> This apparently happens after the proxy has read your headers. Could
> it be that the proxy server requires some kind of magic header? Ask
> the sysadmin who is responsible for the proxy. At least find out what
> the proxy software is, you can probably find the specs on the web....
>
> If you have a way to snoop network packets, it would be interesting to
> see what traffic happens when your regular browser (IE or netscape)
> connects to the proxy from the same client machine (I'm assuming that
> works!).

The proxy server is WinProxy Lite, V2.1. It is running on an NT4 server.
Yes, IE and Netscape both work fine through the proxy server, and no, the
sysadmin doesn't know anything more about WinProxy than how to install it
and configure it for normal http/ftp/smtp/pop3 clients.

Following suggestions from several kind people, I have also tried the
following:

import os, urllib

os.environ['http_proxy'] = "http(colon-slash-slash)10.187.200.230"

f = urllib.urlopen('http://www.python.org')

data = f.read()

This gets as far as the f.read() call before it fails in the same fashion.
Appending ":80" or ":8080" to the proxy URL yields the same results, too.
I'm going to try to contact the WinProxy folks and see if they have any
insight into this problem, but if anyone here can help, it would of course
be appreciated.
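
One way to check what the library actually picks up from the environment, sketched for a current Python 3 (urllib.request) rather than the 1.5.2 urllib used above, whose behaviour may differ. The helper name http_proxy_from_env is made up for this example.

```python
import os
import urllib.request


def http_proxy_from_env(value):
    """Set http_proxy in this process and report what urllib sees."""
    os.environ["http_proxy"] = value
    # getproxies_environment() scans os.environ for <scheme>_proxy entries.
    return urllib.request.getproxies_environment().get("http")


print(http_proxy_from_env("http://10.187.200.230:8080"))
print(http_proxy_from_env(""))  # an empty value yields no http proxy entry
```

If the first call does not echo the proxy URL back, the setting is not reaching the library and the proxy string itself is not yet the problem.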

Thanks,
- Bruce

Trouble with proxies [ In reply to ]
In article <7gd0dh$add$1@nnrp1.dejanews.com>,
befletch@my-dejanews.com wrote:
> In article <5lzp3qafkj.fsf@eric.cnri.reston.va.us>,
> Guido van Rossum <guido@eric.cnri.reston.va.us> wrote:
> > A quick lookup in errno.errorcode shows that that error is
> > WSAECONNRESET, in other words the connection is reset by the server.
> > This apparently happens after the proxy has read your headers. Could
> > it be that the proxy server requires some kind of magic header? Ask
> > the sysadmin who is responsible for the proxy. At least find out what
> > the proxy software is, you can probably find the specs on the web....
> >
> > If you have a way to snoop network packets, it would be interesting to
> > see what traffic happens when your regular browser (IE or netscape)
> > connects to the proxy from the same client machine (I'm assuming that
> > works!).
>
> The proxy server is WinProxy Lite, V2.1. It is running on an NT4 server.
> Yes, IE and Netscape both work fine through the proxy server, and no, the
> sysadmin doesn't know anything more about WinProxy than how to install it
> and configure it for normal http/ftp/smtp/pop3 clients.
>
> Following suggestions from several kind people, I have also tried the
> following:

[snip]

Ok, I have expanded my test and come up with some interesting results.
I'm using IDLE now too, if that matters. Slick program. Anyway, consider
the following script:

import os, urllib

#os.environ['http_proxy'] = "http(colon-slash-slash)10.187.200.230:80/"
#os.environ['http_proxy'] = "http(colon-slash-slash)1.2.3.4:5/"
os.environ['http_proxy'] = ""
# (colon-slash-slash) means :// but DejaNews won't post without the
# translation. Dunno why not.

print os.environ['http_proxy']

#f = urllib.urlopen('http://www.python.org/')
#f = urllib.urlopen('http://www.nonexisting.site/')
f = urllib.urlopen('http://www.ibm.com/')

print f

data = f.readline()

while len(data)>0:
    print data
    data = f.readline()

No matter which proxy string I use, or which URL, I get the following:


<addinfourl at 9536416 whose fp = <socket._fileobject instance at 916b20>>
<HEAD><TITLE>403 Forbidden</TITLE></HEAD>

<BODY><H1>403 Forbidden</H1>

<P>The request was not properly formatted. A possible security risk
detected.</P>

Traceback (innermost last):
  File "C:\PROGRA~1\PYTHON\TOOLS\IDLE\ScriptBinding.py", line 131, in run_module_event
    execfile(filename, mod.__dict__)
  File "C:\Users\Bruce\postal codes\idle_experiment.py", line 19, in ?
    data = f.readline()
  File "C:\Program Files\Python\Lib\plat-win\socket.py", line 117, in readline
    new = self._sock.recv(self._rbufsize)
error: (10054, 'winsock error')

It would appear that the proxy server isn't at issue. Who is
generating that HTML output, anyway?

Thanks,
- Bruce

Trouble with proxies [ In reply to ]
Bruce Fletcher <befletch@my-dejanews.com> writes:

> No matter which proxy string I use, or which URL, I get the following:
>
> <addinfourl at 9536416 whose fp = <socket._fileobject instance at 916b20>>
> <HEAD><TITLE>403 Forbidden</TITLE></HEAD>
>
> <BODY><H1>403 Forbidden</H1>
>
> <P>The request was not properly formatted. A possible security risk
> detected.</P>
>
> Traceback (innermost last):
>   File "C:\PROGRA~1\PYTHON\TOOLS\IDLE\ScriptBinding.py", line 131, in run_module_event
>     execfile(filename, mod.__dict__)
>   File "C:\Users\Bruce\postal codes\idle_experiment.py", line 19, in ?
>     data = f.readline()
>   File "C:\Program Files\Python\Lib\plat-win\socket.py", line 117, in readline
>     new = self._sock.recv(self._rbufsize)
> error: (10054, 'winsock error')
>
> It would appear that the proxy server isn't at issue. Who is
> generating that HTML output, anyway?

My guess is that your proxy server generates that HTML! Now you have
a clue -- look it up in the server's documentation. I'm guessing that
it requires some optional HTTP header that modern browsers always send
but that poor li'l urllib.py doesn't yet know about.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Trouble with proxies [ In reply to ]
>>>>> "BF" == Bruce Fletcher <befletch@my-dejanews.com> writes:

BF> Ok, I have expanded my test and come up with some interesting
BF> results. I'm using IDLE now too, if that matters. Slick
BF> program. Anyway, consider the following script:

[...]
f = urllib.urlopen('http://www.ibm.com/')

print f

data = f.readline()
[...]

BF> No matter which proxy string I use, or which URL, I get the
BF> following:

BF> <addinfourl at 9536416 whose fp = <socket._fileobject instance
BF> at 916b20>> <HEAD><TITLE>403 Forbidden</TITLE></HEAD>

BF> <BODY><H1>403 Forbidden</H1>

BF> <P>The request was not properly formatted. A possible security
BF> risk detected.</P>

It would be helpful to see the HTTP headers as well. Can you try it
with the following:

f = urllib.urlopen('http://www.python.org/')
msg = f.info()
for hdr in msg.headers:
    print hdr,
data = f.readline()

This may shed a little more light on what specifically is the
problem. I agree with Guido, though, that the proxy server is
generating the error message.

Jeremy
Trouble with proxies [ In reply to ]
In article <14123.28729.404968.815913@bitdiddle.cnri.reston.va.us>,
jeremy@cnri.reston.va.us wrote:
> It would be helpful to see the HTTP headers as well. Can you try it
> with the following:
>
> f = urllib.urlopen('http://www.python.org/')
> msg = f.info()
> for hdr in msg.headers:
>     print hdr,
> data = f.readline()
>
> This may shed a little more light on what specifically is the
> problem. I agree with Guido, though, that the proxy server is
> generating the error message.
>
> Jeremy

Ok, here is the result with headers. They answer the 'where from'
question, but not the 'why'. Not for me at least:

Proxy-agent: Ositis-WinProxy
Connection: close
Pragma: no-cache
Cache-Control: no-cache
Content-Type: text/html
Content-Encoding: 7bit
Content-Length: 161
<HEAD><TITLE>403 Forbidden</TITLE></HEAD>

<BODY><H1>403 Forbidden</H1>

<P>The request was not properly formatted. A possible security risk
detected.</P>

Traceback (innermost last):
  File "C:\PROGRA~1\PYTHON\TOOLS\IDLE\ScriptBinding.py", line 131, in run_module_event
    execfile(filename, mod.__dict__)
  File "C:\Users\Bruce\postal codes\idle_experiment.py", line 21, in ?
    data = f.readline()
  File "C:\Program Files\Python\Lib\plat-win\socket.py", line 117, in readline
    new = self._sock.recv(self._rbufsize)
error: (10054, 'winsock error')

Where do I go from here? I've tried to start a dialog with Ositis,
but I'm not very hopeful that they will be able to help. As an aside,
I am curious why the 'winsock error' is occurring, since the response from
the proxy server looks pretty full-formed. Any ideas on that part?
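
The header dump answers the 'where from' question through its Proxy-agent line. As a small illustration in a current Python 3, the same text can be parsed offline with the standard email parser. The blank line separating headers from body is added here because the parser needs it, and the body is cut to its first line; both are adjustments for the sketch, not part of the captured output.

```python
from email import message_from_string

# Header block copied from the output above; body shortened.
RAW_RESPONSE = """\
Proxy-agent: Ositis-WinProxy
Connection: close
Pragma: no-cache
Cache-Control: no-cache
Content-Type: text/html
Content-Encoding: 7bit
Content-Length: 161

<HEAD><TITLE>403 Forbidden</TITLE></HEAD>
"""

msg = message_from_string(RAW_RESPONSE)

# Header lookup is case-insensitive.
print("responder:", msg["Proxy-agent"])
print("connection:", msg["Connection"])
print("declared body length:", msg["Content-Length"])
```

The "Connection: close" header fits with the reset seen afterwards: the proxy says it will close the connection once the error page has been sent.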

Thanks,
- Bruce

Trouble with proxies [ In reply to ]
Bruce Fletcher wrote:
> error: (10054, 'winsock error')
>
> Where do I go from here? I've tried to start a dialog with Ositis,
> but I'm not very hopeful that they will be able to help.

I've lost the beginning of this thread, but you may wish
to make sure that urllib.py is sending a Host header with
the request.

try adding some debugging code to urllib.py's open_http;
something like:

h = httplib.HTTP(host)
h.set_debuglevel(1) # enable logging

and post the result. may give us some additional
clues.
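
On a current Python 3 the equivalent switch lives on http.client.HTTPConnection. The sketch below is an assumption-laden illustration, not the 1.5.2 code path: RecordingSocket is a stand-in invented here so that nothing touches the network, and the User-Agent value is made up.

```python
import http.client


class RecordingSocket:
    """Stand-in for a real socket; it only remembers what was sent."""

    def __init__(self):
        self.sent = []

    def sendall(self, data):
        self.sent.append(bytes(data))


# Proxy address from the thread; port 8080 is only an example.
conn = http.client.HTTPConnection("10.187.200.230", 8080)
conn.set_debuglevel(1)          # prints a "send:" line for each write
conn.sock = RecordingSocket()   # a socket is already set, so no real connect

# A proxy-style request carries the absolute URL in the request line.
conn.putrequest("GET", "http://www.python.org/")
conn.putheader("User-Agent", "Python-urllib-sketch")
conn.endheaders()

wire = b"".join(conn.sock.sent)
print(wire.decode("ascii"))
```

The printed bytes show the request line and the Host header exactly as they would go out, which is the information asked for above.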

> As an aside, I am curious why the 'winsock error' is occurring,
> since the response from the proxy server looks pretty full-
> formed.

error 10054 is "connection reset" -- in other words, the
connection was explicitly closed by the server (the fact
that the exception occurs in a readline seems to indicate
that someone's reading too far...)

</F>
Trouble with proxies [ In reply to ]
In article <055401be959b$835c8220$f29b12c2@pythonware.com>,
"Fredrik Lundh" <fredrik@pythonware.com> wrote:
> I've lost the beginning of this thread, but you may wish
> to make sure that urllib.py is sending a Host header with
> the request.
>
> try adding some debugging code to urllib.py's open_http;
> something like:
>
> h = httplib.HTTP(host)
> h.set_debuglevel(1) # enable logging
>
> and post the result. may give us some additional
> clues.

And the result is:

send: 'GET http://www.python.org/ HTTP/1.0\015\012'
send: 'Host: www.python.org\015\012'
send: 'User-agent: Python-urllib/1.10\015\012'
Traceback (innermost last):
  File "C:\PROGRA~1\PYTHON\TOOLS\IDLE\ScriptBinding.py", line 131, in run_module_event
    execfile(filename, mod.__dict__)
  File "C:\Users\Bruce\postal codes\idle_experiment.py", line 9, in ?
    f = urllib.urlopen('http://www.python.org/')
  File "C:\Program Files\Python\Lib\urllib.py", line 59, in urlopen
    return _urlopener.open(url)
  File "C:\Program Files\Python\Lib\urllib.py", line 157, in open
    return getattr(self, name)(url)
  File "C:\Program Files\Python\Lib\urllib.py", line 263, in open_http
    for args in self.addheaders: apply(h.putheader, args)
  File "C:\Program Files\Python\Lib\httplib.py", line 105, in putheader
    self.send(str)
  File "C:\Program Files\Python\Lib\httplib.py", line 84, in send
    self.sock.send(str)
  File "<string>", line 1, in send
IOError: [Errno socket error] (10054, 'winsock error')

Does that lead anywhere?

Thanks,
- Bruce

Trouble with proxies [ In reply to ]
So here's the problem: There is a bug in the WinProxy. It will return
the 403 error you are getting whenever the initial packet from the
client does not contain a full HTTP request. I installed a copy
locally and could easily reproduce the problem; you can check yourself
by telnetting directly to the proxy server and trying to type an HTTP
request -- 'GET http://www.dejanews.com/ HTTP/1.0'. As soon as you
hit the carriage return, you'll get the error.

This is definitely a problem with the proxy, and they ought to fix it.
On my machine, Netscape sends the whole request in the first packet,
so it doesn't have a problem. Python triggers the bug because it
sends each line of the request separately. (I think there was a
thread about this behavior in the newsgroup a while back, but I can't
think of the right search terms to turn it up. It is inefficient, but
not incorrect.)
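
To make the difference concrete, here is a toy model in current Python 3. It only counts send() calls on a stand-in object; a real TCP stack does not promise one packet per send(), so this illustrates the pattern rather than reproducing the proxy behaviour. CountingSocket and the helper functions are names invented for the sketch.

```python
class CountingSocket:
    """Stand-in socket that records each chunk handed to send()."""

    def __init__(self):
        self.chunks = []

    def send(self, data):
        self.chunks.append(data)
        return len(data)


# The request from the debug trace, plus the blank line that ends it.
REQUEST_LINES = [
    b"GET http://www.python.org/ HTTP/1.0\r\n",
    b"Host: www.python.org\r\n",
    b"User-agent: Python-urllib/1.10\r\n",
    b"\r\n",
]


def send_line_by_line(sock, lines):
    """One send() per line, as the old httplib did."""
    for line in lines:
        sock.send(line)


def send_at_once(sock, lines):
    """The whole request in a single send()."""
    sock.send(b"".join(lines))


def first_chunk_is_complete(sock):
    """True if the first chunk already ends the header block."""
    return sock.chunks[0].endswith(b"\r\n\r\n")


piecewise = CountingSocket()
send_line_by_line(piecewise, REQUEST_LINES)

whole = CountingSocket()
send_at_once(whole, REQUEST_LINES)

print(len(piecewise.chunks), first_chunk_is_complete(piecewise))
print(len(whole.chunks), first_chunk_is_complete(whole))
```

A proxy that inspects only the first chunk it receives would see a complete request in the second case and just the request line in the first.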

You can work around the bug, if you must, by modifying httplib. There
isn't any particularly clean solution, but here's an example of an
httplib.HTTP subclass that delays sending a request until one of the
following happens: (1) it sees the '\r\n\r\n' that ends the headers,
(2) it has buffered more than 1K of data, or (3) the send is
explicitly forced.

import httplib
import string

class HTTP(httplib.HTTP):
    def __init__(self, host='', port=0):
        httplib.HTTP.__init__(self, host, port)
        self.__buf = ''

    def send(self, str, force=0):
        self.__buf = self.__buf + str
        if (string.find(self.__buf, '\r\n\r\n') != -1) \
           or (len(self.__buf) >= 1024) \
           or force:
            if self.debuglevel > 0: print 'send:', `str`
            self.sock.send(self.__buf)
            self.__buf = ''

    def endheaders(self):
        self.send('\r\n', 1)


If you wire your urllib to use this http implementation, the proxy
bug should remain safely hidden.
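
For anyone reading this on a current Python 3, the same three flush conditions can be sketched as a small standalone wrapper. This is not a drop-in replacement for the subclass above: BufferedSender and ListSocket are names invented here, and the wrapper works on any object with a sendall() method rather than hooking into the HTTP library.

```python
class ListSocket:
    """Stand-in socket that keeps every flushed chunk in a list."""

    def __init__(self):
        self.sent = []

    def sendall(self, data):
        self.sent.append(data)


class BufferedSender:
    """Hold outgoing bytes until the headers end, the buffer reaches
    the size limit, or the caller forces a flush."""

    def __init__(self, sock, limit=1024):
        self.sock = sock
        self.limit = limit
        self._buf = b""

    def send(self, data, force=False):
        self._buf += data
        if (b"\r\n\r\n" in self._buf
                or len(self._buf) >= self.limit
                or force):
            self.sock.sendall(self._buf)
            self._buf = b""


sock = ListSocket()
sender = BufferedSender(sock)
sender.send(b"GET http://www.python.org/ HTTP/1.0\r\n")
sender.send(b"Host: www.python.org\r\n")
sender.send(b"User-agent: Python-urllib/1.10\r\n")
held_back = len(sock.sent)      # nothing flushed yet
sender.send(b"\r\n")            # blank line completes the headers
print(held_back, len(sock.sent))
```

With this wrapper the request line and headers leave in one sendall() call, which is the property the workaround relies on.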

Jeremy
Trouble with proxies [ In reply to ]
In article <14126.21006.876632.122249@bitdiddle.cnri.reston.va.us>,
jeremy@cnri.reston.va.us wrote:
> So here's the problem: There is a bug in the WinProxy. It will return
> the 403 error you are getting whenever the initial packet from the
> client does not contain a full HTTP request.

Cool. Thanks a lot for all the work you've done tracking this one down.
Definitely above and beyond the call. I have added your analysis to my
'incident report' with the WinProxy folks, so hopefully they will consider
fixing this. In the meantime, I will try to get things working with your
work-around code.

Thanks again for the help (everybody),
- Bruce
