Mailing List Archive

Strange behavior with directives ProxyRemote and NoProxy
Hi there,

Most combinations of the ProxyRemote and NoProxy directives do not seem
to work correctly in my setup. As I couldn't find anything meaningful on
the Internet, I'm asking this list.

My setup is as follows (quite complex, but typical):

Ubuntu Server 22.04 LTS
Apache httpd 2.4.52 (not the latest, but I didn't find a related bug/fix
in recent changelogs)

Apache httpd is (among other things) also used as a proxy for requests
to the Internet (to make some external sites appear to be served from
our application's host, to work around some XSS/CORS issues). This is
done with some simple RewriteRules, e.g.

RewriteRule "/proxy/external/foo/0815/" "https://foo.com/svc/0815/" [P]

The server is running in intranet 10.0.0.0. All requests to the Internet
have to go through the company's proxy server 10.5.10.20:8080.

Additionally, the httpd must also proxy a local/intranet service that is
running on host 10.5.20.100. Requests to this host MUST NOT go through
the company's proxy, which ONLY serves external/Internet sites.

Important(?) side note: through DNS the server can only resolve
local/intranet names and addresses. The DNS refuses to resolve
external/Internet names and addresses.

According to the docs, configuring ProxyRemote and NoProxy should be
quite simple:

# All requests go through the company's proxy
ProxyRemote "*" "http://10.5.10.20:8080"

# Direct requests to all intranet hosts
NoProxy ".mycompany.local" "10.0.0.0/8"

The equivalent configuration works for Apache Tomcat as well as for e.g.
curl and wget (through the http(s)_proxy and no_proxy environment
variables).

However, this does not work with Apache httpd. It either doesn't use the
remote proxy at all or sends all requests to the remote proxy.

It seems like NoProxy doesn't work exactly as described in the docs.

If I add the local domain ".mycompany.local" and/or the whole local
subnet "10.0.0.0/8" to NoProxy, the remote proxy is actually never used.
Logs show that in this case Apache httpd tries to directly connect to
the external URL and gives up after a certain time and responds with a
503 Service Unavailable status.

Why is the remote proxy not used here? Is it because the remote proxy
is located in the same domain and subnet 10.0.0.0/8?

The remote proxy also isn't used when I set NoProxy to just
"10.5.0.0/16". One (weird) explanation is that the remote proxy is in
the 10.5.0.0 subnet as well. However, typically, the decision of when to
use the remote proxy should not depend on the remote proxy's address
(but only on the requested address).
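
As a quick sanity check on the subnet reasoning above, here is a minimal
Python sketch using only the standard ipaddress module. It shows which of
the NoProxy subnets tried in this mail contain the remote proxy and which
contain the intranet service. The constant and function names are mine
and purely illustrative, and the sketch says nothing about how httpd
itself evaluates NoProxy.

```python
import ipaddress

# Addresses taken from the setup described in this mail.
REMOTE_PROXY = ipaddress.ip_address("10.5.10.20")
INTRANET_SERVICE = ipaddress.ip_address("10.5.20.100")

# NoProxy subnet values tried in this mail.
SUBNETS = ["10.0.0.0/8", "10.5.0.0/16", "10.5.20.0/24"]


def containing_subnets(addr):
    """Return the entries of SUBNETS that contain the given address."""
    return [s for s in SUBNETS if addr in ipaddress.ip_network(s)]


if __name__ == "__main__":
    print("remote proxy:    ", containing_subnets(REMOTE_PROXY))
    print("intranet service:", containing_subnets(INTRANET_SERVICE))
```

Both addresses fall into "10.0.0.0/8" and "10.5.0.0/16"; of the three
values only "10.5.20.0/24" contains the service but not the proxy.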

When leaving NoProxy empty, the remote proxy is used and proxying
external services works properly.

There's still the intranet service on host 10.5.20.100 to be reverse
proxied as well. I must at least exclude requests to this host from
being sent to the remote proxy. Setting

NoProxy "10.5.20.0/24" (or "10.5.20", "10.5.20.0")

seems to be ignored by httpd, so all requests, including those to
10.5.20.100, are still sent to the remote proxy.

Setting NoProxy to the IP address of the internal service
("10.5.20.100") or to its hostname ("myintlservice.mycompany.local") is
also ignored. All requests still get forwarded to the remote proxy.

Even with LogLevel proxy:trace5 there are no lines logged that say
anything about the decision of using the configured remote proxy or not.
So I was left with trial and error (for several days).

The documentation is quite clear about NoProxy. However, from my point
of view the NoProxy feature seems not to work properly at all.

Am I missing something? Since my C/C++ skills are just below
intermediate (and the httpd source code is quite "compact"), I'm not able
to help myself by reading the sources or to spot bugs there (if any).

My current workaround is to use ProxyRemoteMatch with an expression that
does NOT match any intranet sites:

ProxyRemoteMatch "^https?://(?!(.*\.)?mycompany\.local\b)" \
    "http://10.5.20.1:8080"

This regular expression is quite "expensive" since it uses a negative
lookahead, so this solution is sub-optimal.
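
To see which URLs the workaround pattern selects, here is a small sketch
that runs the same expression through Python's re module. This assumes
that Python's regex semantics agree with httpd's PCRE for this pattern
and that httpd matches it against the full request URL; both are
assumptions on my side. The second and third sample URLs are
hypothetical.

```python
import re

# The pattern from the ProxyRemoteMatch workaround in this mail.
PATTERN = re.compile(r"^https?://(?!(.*\.)?mycompany\.local\b)")


def goes_to_remote_proxy(url):
    """True if the pattern matches, i.e. the URL would be handed to the remote proxy."""
    return PATTERN.search(url) is not None


if __name__ == "__main__":
    samples = (
        "https://foo.com/svc/0815/",                 # external site
        "http://myintlservice.mycompany.local/app",  # intranet host (hypothetical path)
        "https://foo.com/?ref=x.mycompany.local",    # external, local name in query (hypothetical)
    )
    for url in samples:
        route = "remote proxy" if goes_to_remote_proxy(url) else "direct"
        print(url, "->", route)
```

In this sketch the third URL is treated as "direct" even though the host
is external, because the ".*" inside the lookahead is not limited to the
host part of the URL. A stricter variant might use "[^/]*" there, but I
have not tested that with httpd.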

Carsten




---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org
Re: Strange behavior with directives ProxyRemote and NoProxy
Hello,

On Fri, May 5, 2023 at 9:22 AM Carsten Klein <c.klein@datagis.com> wrote:
>
> Important(?) side note: through DNS the server can only resolve
> local/intranet names and addresses. The DNS refuses to resolve
> external/Internet names and addresses.

Unless NoProxy contains only domain names (e.g. ".mycompany.local")
which can be compared verbatim, there will be a DNS resolution for the
requested host. And if that DNS resolution fails, NoProxy does not
apply (i.e. ProxyRemote is used).

>
> According to the docs, configuring ProxyRemote and NoProxy should be
> quite simple:
>
> # All requests go through the company's proxy
> ProxyRemote "*" "http://10.5.10.20:8080"
>
> # Direct requests to all intranet hosts
> NoProxy ".mycompany.local" "10.0.0.0/8"

So here if the requested host does not end in ".mycompany.local", it
will be resolved and compared to the network address.
Your configuration depends on DNS; more exactly, it depends on DNS
working at least for local/intranet hosts. Failures on remote ones
shouldn't be an issue, but this looks fragile and not optimal. It is
broken, though, if the DNS does not fail but returns a 10/8 address for
whatever reason.
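
To illustrate the flow described above, here is a rough Python sketch.
It is only my reading of that description, not httpd's actual code:
domain entries are compared verbatim against the requested host, subnet
entries need a resolved address, and an entry cannot apply if resolution
fails. The function name, the resolve callback and the INTRANET_DNS
table are hypothetical, and the sketch handles only ".domain" and subnet
entries.

```python
import ipaddress

# Stand-in for an intranet-only DNS (hypothetical data).
INTRANET_DNS = {
    "myintlservice.mycompany.local": "10.5.20.100",
    "10.5.20.100": "10.5.20.100",
}


def connects_directly(host, no_proxy, resolve):
    """Return True if a NoProxy entry applies, False if ProxyRemote is used.

    host     -- requested host name or IP literal
    no_proxy -- list of ".domain" or subnet entries
    resolve  -- callable mapping host -> IP string, or None on DNS failure
    """
    for entry in no_proxy:
        if entry.startswith("."):
            # Domain entries are compared verbatim, without DNS.
            if host.lower().endswith(entry.lower()):
                return True
            continue
        # Subnet entries need an address for the requested host.
        network = ipaddress.ip_network(entry, strict=False)
        address = resolve(host)
        if address is None:
            # Resolution failed, so this entry cannot apply.
            continue
        if ipaddress.ip_address(address) in network:
            return True
    return False


if __name__ == "__main__":
    no_proxy = [".mycompany.local", "10.0.0.0/8"]
    for host in ("myintlservice.mycompany.local", "10.5.20.100", "foo.com"):
        print(host, "->", connects_directly(host, no_proxy, INTRANET_DNS.get))
```

With this model, "foo.com" goes to the remote proxy only because the
lookup fails. A resolver that answered with some 10/8 address would make
it connect directly, which is the fragile case mentioned above.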

I would try to only set:
NoProxy ".mycompany.local"
to exclude DNS from the game and see what happens for requests to this
domain at least. If it works for those and you still need to also
match "10.0.0.0/8" for requests using local IP addresses directly or
other/unknown/unlistable local domain names, you probably should have
a look at how hosts are resolved on the local DNS when requests are
misdirected.


Regards,
Yann.

Re: Strange behavior with directives ProxyRemote and NoProxy
Hello Yann,

thanks for your comments :)

> I would try to only set:
> NoProxy ".mycompany.local"
> to exclude DNS from the game and see what happens for requests to this
> domain at least. If it works for those and you still need to also
> match "10.0.0.0/8" for requests using local IP addresses directly or
> other/unknown/unlistable local domain names, you probably should have
> a look at how hosts are resolved on the local DNS when requests are
> misdirected.

External requests (through ProxyRemote) actually do NOT work when
NoProxy is set to just ".mycompany.local". According to what you've
said, DNS is not part of the game here.

However, external requests DO work when NoProxy is left unset or set to
a different, non-existent domain (not my local one), e.g.
".notmycompany.local".

Even stranger: external requests DO work if NoProxy is set to the
domain or hostname of the host that serves the external request:

NoProxy ".google.com" -> requesting 'https://www.google.com' works!
NoProxy "www.google.com" -> requesting 'https://www.google.com' works!

All things considered, NoProxy has only two effects (using names only):

Setting to

1. my local domain ".mycompany.local" -> remote proxy is NEVER used
2. anything else (including unset) -> remote proxy is ALWAYS used

So, NoProxy is not of much help in this scenario.

Since this works with all other software on this host (Apache Tomcat,
curl, wget, etc.), this seems to be a bug in Apache httpd (although
quite hard to believe).

Do you (or someone else) know where that decision algorithm is actually
implemented in those many source files?

Can you (or someone else) set up an environment to test this in order to
confirm or refute my findings?

Regards,

Carsten

Re: Strange behavior with directives ProxyRemote and NoProxy
Hi there,

I've tracked this down to the sources now. Did not find any obvious errors.

The following line numbers refer to version/tag 2.4.52.

However, it seems that in mod_proxy.c, lines 1402 to 1405,
´direct_connect´ is always set to TRUE if and only if NoProxy is set to
my local domain (e.g. ".mycompany.local"). It seems to be set to FALSE
in all other cases.

Here are the log entries (shortened: date, pid, tid and client removed)
for some different domain values of NoProxy:

NoProxy ".mycompany.local"

[<date>] [proxy:trace2] proxy_util.c(2359): *: using default reverse
proxy worker for https://www.geoportal-raumordnung-bw.de/ows/services...
(no keepalive)
[<date>] [proxy:debug] mod_proxy.c(1503): AH01143: Running scheme https
handler (attempt 0)
[<date>] [proxy:debug] proxy_util.c(2531): AH00942: https: has acquired
connection for (*)
[<date>] [proxy:debug] proxy_util.c(2587): AH00944: connecting
https://www.geoportal-raumordnung-bw.de/ows/services... to
www.geoportal-raumordnung-bw.de:443
[<date>] [proxy:debug] proxy_util.c(2810): AH00947: connected
/ows/services... to www.geoportal-raumordnung-bw.de:443
[<date>] [proxy:trace2] proxy_util.c(3244): https: fam 2 socket created
to connect to *
...
waiting for timeout ... ... ... ... ...
...
[<date>] [proxy:error] (110)Connection timed out: AH00957: https:
attempt to connect to 5.9.89.16:443 (*) failed
[<date>] [proxy_http:error] AH01114: HTTP: failed to make connection to
backend: www.geoportal-raumordnung-bw.de
[<date>] [proxy:debug] proxy_util.c(2546): AH00943: https: has released
connection for (*)

httpd did not try to use the ProxyRemote proxy server. Why not? Domain
"geoportal-raumordnung-bw.de" is NOT EQUAL to "mycompany.local", is it?



Setting NoProxy to anything else (including
".geoportal-raumordnung-bw.de"!) makes httpd use the ProxyRemote for ALL
requests!


NoProxy ".geoportal-raumordnung-bw.de"

[<date>] [proxy:trace2] proxy_util.c(2359): *: using default reverse
proxy worker for https://www.geoportal-raumordnung-bw.de/ows/services...
(no keepalive)
[<date>] [proxy:debug] mod_proxy.c(1453): AH01142: Trying to run
scheme_handler against proxy
[<date>] [proxy:debug] proxy_util.c(2531): AH00942: https: has acquired
connection for (*)
[<date>] [proxy:debug] proxy_util.c(2587): AH00944: connecting
https://www.geoportal-raumordnung-bw.de/ows/services... to
www.geoportal-raumordnung-bw.de:443
[<date>] [proxy:debug] proxy_util.c(2810): AH00947: connected
/ows/services... to 10.5.20.100:8080 // the ProxyRemote!
[<date>] [proxy:trace2] proxy_util.c(3244): https: fam 2 socket created
to connect to *
[<date>] [proxy:debug] proxy_util.c(3276): AH02824: https: connection
established with 10.5.20.1:8080 (*)
[<date>] [proxy:debug] proxy_util.c(2903): AH00948: CONNECT: sending the
CONNECT request for www.geoportal-raumordnung-bw.de:443 to the remote
proxy 10.5.20.100:8080 (10.5.20.100)
[<date>] [proxy:debug] proxy_util.c(2959): AH00949: send_http_connect:
response from the forward proxy: HTTP/1.0 200 Connection
established\r\nProxy-Agent: Fortinet-Proxy/1.0\r\n\r\n
[<date>] [proxy:trace1] proxy_util.c(3450): [remote 10.5.20.100:8080]
https: set SNI to www.geoportal-raumordnung-bw.de for (10.5.20.100)
[<date>] [proxy:debug] proxy_util.c(3462): AH00962: https: connection
complete to 10.5.20.100:8080 (10.5.20.100)
[<date>] [proxy:debug] proxy_util.c(2546): AH00943: *: has released
connection for (*)
[<date>] [proxy:debug] proxy_util.c(3386): [remote 10.5.20.100:8080]
AH02642: proxy: connection shutdown


Both ´ap_proxy_is_domainname´ and ´proxy_match_domainname´ (the matcher
function for domain names) in proxy_util.c seem to be correct.

No idea what's going on here.

Debian/Ubuntu apply a bunch of patches to the apache2 package. Maybe
they patch it to death...

Is there anything else I could be missing?

Regards,

Carsten

Re: Strange behavior with directives ProxyRemote and NoProxy
Hi there,

Update: AFAICS, Debian/Ubuntu patches do nothing with NoProxy.

I've tested this on a different server (also an Ubuntu 22.04 LTS box
with Apache httpd 2.4.52) in a completely different network. The NoProxy
directive behaves slightly differently on that host.

On that host, NoProxy seems to have absolutely NO effect. No matter what
I'm configuring, the ProxyRemote is ALWAYS used. (On the other host, I
could prevent ProxyRemote from being used by adding my local domain or
the local subnet to NoProxy).

On the new host, I've tried domains, hostnames, IP addresses and
subnets (DNS works fine on that server for all hosts).

Does anyone have an idea what's wrong with this configuration?


ProxyRemote "https" "http://192.168.2.1:3128"
NoProxy ".geoportal-raumordnung-bw.de"

ProxyPass "/foo/bar" \
    "https://www.geoportal-raumordnung-bw.de/ows/services/org.1.09570b44-6616-4482-8680-90743239483d_wms"

Actually, this configuration uses the proxy server 192.168.2.1:3128 for
any requests to "/foo/bar".


Maybe reading in the NoProxy configuration is somehow broken? Too bad
that most of the debug statements in set_proxy_dirconn (mod_proxy.c) are
located inside an ´#if DEBUGGING´ condition, so I have no way to see
them with my standard binaries.

Has anyone out there actually managed to use ProxyRemote together with
NoProxy in a useful way?

Regards,

Carsten
