[squid-users] Content Adaptation with HTTPs
Amos Jeffries
squid3 at treenet.co.nz
Sun Aug 20 08:19:11 UTC 2017
On 20/08/17 16:05, Christopher Ahrens wrote:
>
> The current solution doesn't work for me since it only supports a very
> limited number of clients. I am working with a charity that provides
> internet services to those with impaired vision. The intention of my
> project was to set up a semi-public proxy for recipients of the charity
> (e.g., we would install DD-WRT-like routers within their homes that would
> create a tunnel into our network so that they could browse the internet
> using off-the-shelf systems). We recently received a large number of
> tablets from a corporate donor. The tablets themselves will work for our
> recipients, but unfortunately the internet at large does not.
FYI: If you can get the adaptation part to be small enough, a non-caching
Squid should be able to run on those WRT-like devices in under 32 MB
of RAM. So the tunnel may not be necessary, just a way to update
the software and its config.
>
> We've looked into commercial systems in the past, but we cannot afford
> the cost of commercial systems, especially since we are unsure about the
> exact licensing that would be needed for our endeavor. We have also
> been burnt in the past with commercial software where the project either
> goes dead, begins to require insanely expensive appliances, or the
> license price is sent sky-high.
>
> Would it be possible to use a setup of Squid <-> Privoxy <-> Squid to
> execute this? I figure we'd build an internal instance that will handle
> the client<->proxy part, Privoxy handles the content modification, then
> a second Squid instance to handle the web server<->proxy part.
Squid will only send SSL-Bump'ed HTTPS traffic over encrypted
connections. So that is only possible if Privoxy accepts TLS connections
from Squid. In which case you probably do not need the second Squid, as
Privoxy would also be doing the HTTPS to-server part easily enough itself.
>
> So it looks like the solution would be to find a developer to write an
> eCAP adapter to cycle through regexes to replace/remove HTML/CSS content.
> So time to dig out my old C++ books and get to work...
If the existing ICAP/eCAP options are not suitable, then yes, a custom
one would be needed.
It is not as easy as a few regex replacements though. Adaptors are
streamed the full on-wire HTTP message format with only minor
sanitization by Squid's parser. To alter the content you will have to
deal with data encodings, object ranges, and partially received objects. And
it is best to assume everything is of infinite length unless explicitly
told otherwise - so no buffer-then-adapt code.
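
To make the "no buffer-then-adapt" point concrete, here is a minimal
sketch in Python (not eCAP code, and not anything Squid ships). It
replaces a fixed byte string rather than a regex, because a regex match
that spans chunk boundaries is considerably harder. The function name
stream_replace and the sample tags are hypothetical. It assumes the body
has already been decoded to its identity encoding.

```python
from typing import Iterable, Iterator


def stream_replace(chunks: Iterable[bytes], old: bytes, new: bytes) -> Iterator[bytes]:
    """Replace `old` with `new` in a chunked stream using bounded memory."""
    if not old:
        raise ValueError("old must be non-empty")
    keep = len(old) - 1  # longest prefix of `old` that may end a chunk
    tail = b""           # held-back bytes that might start a split match
    for chunk in chunks:
        buf = tail + chunk
        out = []
        pos = 0
        while True:
            hit = buf.find(old, pos)
            if hit < 0:
                break
            out.append(buf[pos:hit])
            out.append(new)
            pos = hit + len(old)
        # Emit everything except the last `keep` bytes of the unmatched
        # remainder; a match starting there could finish in the next chunk.
        rest = buf[pos:]
        cut = max(len(rest) - keep, 0)
        out.append(rest[:cut])
        tail = rest[cut:]
        piece = b"".join(out)
        if piece:
            yield piece
    if tail:
        yield tail  # end of stream: no further bytes can complete a match


if __name__ == "__main__":
    body = [b"<p><bl", b"ink>hello</bl", b"ink></p>"]
    print(b"".join(stream_replace(body, b"blink>", b"b>")))
```

At most one chunk plus len(old) - 1 bytes are held at any time, however
long the message is. Since the replacement changes the body length, a
real adapter would also have to drop or correct any Content-Length
header before forwarding.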
eCAP is simpler than ICAP, but still has to deal with these HTTP features.
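
One conservative way to cope with those features is to pass through,
untouched, anything the adapter does not fully understand. The sketch
below is an illustrative policy only, written in Python. The name
can_adapt, the set of media types, and the choice to skip compressed and
partial responses are all assumptions, not eCAP or Squid behaviour.

```python
from typing import Mapping

# Hypothetical policy: only rewrite plain HTML and CSS.
ADAPTABLE_TYPES = {"text/html", "text/css"}


def can_adapt(status: int, headers: Mapping[str, str]) -> bool:
    """Return True only for complete, uncompressed HTML/CSS responses."""
    h = {name.lower(): value.strip() for name, value in headers.items()}

    # Partial content (range responses) is only a slice of the object.
    if status == 206 or "content-range" in h:
        return False

    # Compressed bodies would need decoding before any text matching.
    encoding = h.get("content-encoding", "identity").lower()
    if encoding not in ("", "identity"):
        return False

    # Ignore parameters such as "; charset=utf-8".
    media_type = h.get("content-type", "").split(";", 1)[0].strip().lower()
    return media_type in ADAPTABLE_TYPES


if __name__ == "__main__":
    print(can_adapt(200, {"Content-Type": "text/html; charset=utf-8"}))
    print(can_adapt(200, {"Content-Type": "text/html", "Content-Encoding": "gzip"}))
```

Skipping compressed responses keeps the adapter simple, but it means
much real-world traffic goes unmodified. A fuller adapter would decode
those bodies instead.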
Those are a big part of why available software is so sparse. The other
part being that HTTP traffic payloads are copyrighted content, so there
are legal issues with selling software for the purpose of altering
copyrighted content without the author's permission.
Amos