mod_extract_forwarded is designed to transparently modify a connection so
that it looks like it came from the IP behind a proxy server rather than
the proxy itself. This affects all subsequent stages of request processing
including access control, logging, and CGIs. It relies on the
"X-Forwarded-For" header to do this. This header should be added by all
well-behaved proxies. If the proxy doesn't add it, we can't do anything
about it.

A request may pass through multiple proxies on its way to the server, in
which case X-Forwarded-For should contain multiple IPs. The leftmost IP
in the list is the originating client IP[1]; that is the one
mod_extract_forwarded will use.
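
To make the "leftmost IP" rule concrete, here is a minimal sketch in
Python. It is not the module's code (the module is an Apache module, not
a script); the function name originating_client is made up for this
example, and it assumes the header value is a plain comma-separated list.

```python
def originating_client(header_value):
    """Return the leftmost entry of an X-Forwarded-For value.

    Hypothetical helper for illustration only. Returns None when the
    header value contains no entries.
    """
    # Split on commas and drop surrounding whitespace and empty entries.
    entries = [part.strip() for part in header_value.split(",")]
    entries = [entry for entry in entries if entry]
    return entries[0] if entries else None


if __name__ == "__main__":
    print(originating_client("192.0.2.10, 10.0.0.1, 10.0.0.2"))
```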

Since we are altering the connection record to remove references to the
actual connecting IP, it might be useful to remember that IP somewhere. We
store it in the environment variable PROXY_ADDR immediately before
altering the connection record. So CGIs have access to PROXY_ADDR if they
need it. Other Apache modules can also get to PROXY_ADDR via the
request_rec's subprocess_env table.
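
As an illustration, a CGI script could read PROXY_ADDR like any other
environment variable. The sketch below is in Python and assumes that
REMOTE_ADDR (the standard CGI variable) reflects the rewritten
connection, as the description above implies. The function name
connection_summary is made up for this example.

```python
import os


def connection_summary(environ=os.environ):
    """Describe the client and, if recorded, the proxy it came through.

    Hypothetical helper: PROXY_ADDR is the variable described above,
    REMOTE_ADDR is the standard CGI variable.
    """
    client = environ.get("REMOTE_ADDR", "unknown")
    proxy = environ.get("PROXY_ADDR")
    if proxy is None:
        # X-Forwarded-For was not processed for this request.
        return "client %s (no forwarding proxy recorded)" % client
    return "client %s via proxy %s" % (client, proxy)


if __name__ == "__main__":
    # Minimal CGI response: header, blank line, body.
    print("Content-Type: text/plain")
    print()
    print(connection_summary())
```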

Using this module has potentially serious implications for host-based
access control to your server. Since "X-Forwarded-For" is just a piece of
text in a request header, spoofing it is trivial. To compensate for this,
mod_extract_forwarded provides configuration directives to restrict the
proxy hosts for which X-Forwarded-For will be processed. Disallowing a
proxy host with these directives doesn't mean the proxy can't get pages
from your server; it just means the forwarded IP won't be used. It is
_strongly_ advised that you only process "X-Forwarded-For" from proxies
you trust.

If a request has passed through multiple proxies then the X-Forwarded-For
may contain several IPs like this:

X-Forwarded-For: client1, proxy1, proxy2

_All_ the IPs, with the exception of the originating client, must be in
the allow list. If proxy1 and proxy2 are in the allow list then the above
header is safe, but the below is not:

X-Forwarded-For: client1, untrusted1, proxy1, proxy2

Strictly speaking, if we have a list of IPs: 

ip_0, ip_1, ... ip_n

then the set of IPs ip_1, ... ip_n must be a subset of the allow list.
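
The subset rule can be sketched in a few lines of Python. This only
illustrates the rule as stated here, not the module's implementation:
the function name forwarders_trusted is made up, entries are compared as
plain strings (no hostname lookups), and the separate check on the host
that actually connected is left out.

```python
def forwarders_trusted(header_value, allow_list):
    """Return True if ip_1 .. ip_n are all in the allow list.

    Hypothetical helper. ip_0, the originating client, is exempt from
    the check, so a header with a single entry always passes.
    """
    ips = [part.strip() for part in header_value.split(",") if part.strip()]
    # Everything after the first entry must be a trusted forwarder.
    return set(ips[1:]) <= set(allow_list)


if __name__ == "__main__":
    allow = {"proxy1", "proxy2"}
    print(forwarders_trusted("client1, proxy1, proxy2", allow))
    print(forwarders_trusted("client1, untrusted1, proxy1, proxy2", allow))
```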

There is also a directive for allowing or disallowing proxies to cache
the returned content. You may find yourself in a situation where some
hosts behind a caching proxy are allowed access to a URI but other hosts
behind the same proxy are not. If the proxy caches the content when it
responds to an allowed client, it might not re-check with your server
before giving the cached content to another client.[2] If you find
yourself in this situation, use AllowForwarderCaching (described below)
to deactivate caching for locations with protected content.

The configuration directives are as follows:

AllowForwarderCaching: On or Off - will we allow any caches along the
request path to cache this response?

AddAcceptForwarder: add this IP or hostname to the list of hosts from
which we will honor the "X-Forwarded-For" header.

RemoveAcceptForwarder: remove this IP or hostname from the list of hosts
built with AddAcceptForwarder.

If no directives are found the default is to ignore X-Forwarded-For from
all proxies and to allow caching.[3] In other words if you load the module
but don't configure it, it doesn't do anything.

Like many other directives, these may be specified at "top level" outside
any container directives and then overridden inside containers. You may
also use them inside .htaccess files if AllowOverride Options is in
effect. The effect of the directives is cumulative. Example:

AddAcceptForwarder 10.0.0.1
AddAcceptForwarder 10.0.0.2
AllowForwarderCaching On

<Location /foobar>
RemoveAcceptForwarder 10.0.0.2
AllowForwarderCaching Off
</Location>

So inside /foobar, 10.0.0.1 is still accepted and 10.0.0.2 is not, and
10.0.0.1 is not allowed to cache responses. (Perhaps /foobar contains
some sensitive content.)

AddAcceptForwarder and RemoveAcceptForwarder also take an "all" keyword
which does exactly that - "AddAcceptForwarder all" will accept
X-Forwarded-For from all proxies and "RemoveAcceptForwarder all" will
totally blank the accept list. The "all" keyword makes possible something
like this:

AddAcceptForwarder all
RemoveAcceptForwarder 10.0.0.3

<Location /foobaz>
RemoveAcceptForwarder all
AddAcceptForwarder 10.0.0.4
</Location>

At top level we accept from all proxies _except_ 10.0.0.3. Inside /foobaz
we totally blank the accept list and then accept from _only_ 10.0.0.4.

The order in which the directives appear in a container does not matter;
they are always processed in this order:

(1) RemoveAcceptForwarder all
(2) AddAcceptForwarder all
(3) any other RemoveAcceptForwarders
(4) any other AddAcceptForwarders

Those rules might seem restrictive at first but in practice you won't even
have to know them. They are based on the premise that you won't try to
remove your own AddAcceptForwarders inside the same container. If you do
(why?!) you won't get what you expect. But on the whole the rules will
behave very naturally.
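
For readers who want to see the processing order spelled out, here is a
rough model in Python. It is a sketch under assumptions, not the module's
data structure: the AcceptList class, its fields, and the way "all" is
tracked as a flag plus a set of exclusions are all invented for this
example.

```python
class AcceptList:
    """Hypothetical model of the accept list and its inheritance."""

    def __init__(self, accept_all=False, added=(), removed=()):
        self.accept_all = accept_all   # set by "AddAcceptForwarder all"
        self.added = set(added)        # hosts accepted explicitly
        self.removed = set(removed)    # hosts excluded explicitly

    def apply_container(self, directives):
        """Return a new list: this one plus one container's directives.

        directives is a list of (name, argument) pairs in any order;
        they are always applied in the order (1) to (4) given above.
        """
        result = AcceptList(self.accept_all, self.added, self.removed)
        removes = [a for n, a in directives if n == "RemoveAcceptForwarder"]
        adds = [a for n, a in directives if n == "AddAcceptForwarder"]
        if "all" in removes:           # (1) blank the inherited list
            result.accept_all = False
            result.added.clear()
            result.removed.clear()
        if "all" in adds:              # (2) accept every proxy
            result.accept_all = True
            result.removed.clear()
        for host in removes:           # (3) individual removals
            if host != "all":
                result.added.discard(host)
                result.removed.add(host)
        for host in adds:              # (4) individual additions
            if host != "all":
                result.removed.discard(host)
                result.added.add(host)
        return result

    def accepts(self, host):
        if host in self.added:
            return True
        return self.accept_all and host not in self.removed


if __name__ == "__main__":
    top = AcceptList().apply_container([
        ("AddAcceptForwarder", "all"),
        ("RemoveAcceptForwarder", "10.0.0.3"),
    ])
    foobaz = top.apply_container([
        ("RemoveAcceptForwarder", "all"),
        ("AddAcceptForwarder", "10.0.0.4"),
    ])
    print(top.accepts("10.0.0.3"), foobaz.accepts("10.0.0.4"))
```

In this model, an AddAcceptForwarder and a RemoveAcceptForwarder for the
same host in one container leave the host accepted, because removals run
before additions. That is the "you won't get what you expect" case.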


[1] From examining the squid source, I think the leftmost IP is the
correct one. I haven't been able to empirically test that. If someone
could confirm that or correct me, I would appreciate it.

[2] I wasn't able to get squid 2.2 to do this (i.e. give cached content to
a disallowed client) but I also haven't tried any other proxies. The
possibility is always there. Once the content is cached, you've lost
control of it.

[3] The reasoning here is that since we're not honoring any
X-Forwarded-Fors we can't be spoofed, so caching is safe. As soon as you
put in your first AddAcceptForwarder you should think about caching.


ahosey@systhug.com
$Id: README,v 1.4 2000/06/02 21:17:29 ahosey Exp $