[squid-dev] Injecting custom JavaScript

Thu Jun 18 12:41:18 UTC 2015

Hi Amos,

> On 18/06/2015 11:42 p.m., James Hunter wrote:
>> Hi,
>>
>> I've been looking to inject special JavaScript code into every HTML page
>> my squid proxy receives via HTTPS connections, this is for an
>> application where the users will be fully aware of the injection.
> Let's start with how this is a truly terrible idea.
>
> The content you are seeking to change is:
> a) somebody else's property,
> b) copyrighted as such,
> c) potentially subject to external checksum and digital protections,
> d) HTTP relies on it being unchanged from origin server copies for
> correct revalidation operations.
This is for a stress testing tool that needs to know what the user is
doing so it can record keyboard and mouse activity at the browser event
level. It runs only on a single authorized domain, for which users must
have given signed consent. If they use the proxy for a domain that is
not authorized, it redirects them to our web app to seek authorization.
There is no way to do this without injecting our own JavaScript library
to push those events to a recording application.

Hopefully that's not a truly terrible idea :) It's just a necessity for us.

>
>> I've correctly configured Squid to do the SSL Bump, having verified in
>> WireShark that the two sides are communicating via separate connections.
> That means little these days. Splice works via two connections and Squid
> does not participate in the TLS layer inside those connections.
>
> You have to also confirm that Squid has access to the decrypted data,
> that shows up in wireshark as different sets of crypto operating on each
> connection.
I will verify this. I was wondering where (at the C++ level) I can gain
access to the segments of each request. I couldn't see in the source
where the "data segment" is available. Could anyone point me to the file
where the data is available in its raw format (as a void* or similar)?

>
>> Can someone point out where the plain HTTP / TCP request flows through
>> squid, after it's deciphered by one side - but before it's encrypted for
>> the other? I want to scan the buffers to find any <BODY> tags and
>> ensure that the script is inserted.
> The "page" is not plaintext HTTP. It's binary payload. Squid is designed
> explicitly NOT to touch them.
>
> It is also relayed as given, spread over many smaller buffer segments
> and very likely compressed. There is 0 guarantee that you will be able
> to see a whole sequence of "<BODY>" characters as a string even if one
> existed bare in the payload.
>
> If you want to do payload adaptation use an ICAP service or eCAP module.
> The above guarantee is still not provided, but the *CAP APIs provide
> easier ways to access the transaction data.
> NP: Don't forget to skip altering any and all messages with
> "Cache-Control:no-transform" header - that is critical.
Understood, we will have to create some sort of bitstream analyzer that 
can handle separate segments of binary data (for the same URL).
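
As a rough illustration of the segment problem, here is a minimal sketch
of a matcher that keeps its state between buffers, so a tag split across
two segments is still found. StreamMatcher and feed() are hypothetical
names, not anything from the Squid source, and the sketch assumes the
payload has already been decompressed; it would not find anything in a
gzip-encoded body.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Squid: finds a case-insensitive ASCII
// pattern in a byte stream that arrives in arbitrary segments.
class StreamMatcher {
public:
    explicit StreamMatcher(const std::string &pattern)
        : pat_(pattern), fail_(pattern.size(), 0) {
        for (char &c : pat_)
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        // Standard KMP failure table: fail_[i] is the length of the longest
        // proper prefix of pat_[0..i] that is also a suffix of it.
        for (std::size_t i = 1, k = 0; i < pat_.size(); ++i) {
            while (k > 0 && pat_[i] != pat_[k]) k = fail_[k - 1];
            if (pat_[i] == pat_[k]) ++k;
            fail_[i] = k;
        }
    }

    // Feed one segment. Returns the absolute stream offset just past the
    // first match completed inside this segment, or -1 if none completed.
    long long feed(const char *data, std::size_t len) {
        if (pat_.empty()) return -1;
        long long found = -1;
        for (std::size_t i = 0; i < len; ++i) {
            const char c = static_cast<char>(
                std::tolower(static_cast<unsigned char>(data[i])));
            while (matched_ > 0 && c != pat_[matched_])
                matched_ = fail_[matched_ - 1];
            if (c == pat_[matched_]) ++matched_;
            ++offset_;
            if (matched_ == pat_.size()) {
                if (found < 0) found = offset_;
                matched_ = fail_[matched_ - 1];  // keep scanning for more
            }
        }
        return found;
    }

private:
    std::string pat_;
    std::vector<std::size_t> fail_;
    std::size_t matched_ = 0;   // pattern bytes matched so far, kept across calls
    long long offset_ = 0;      // total bytes consumed so far
};
```

Feeding "<html><BO" and then "DY class=x>" to a StreamMatcher built
with "<body" reports no match for the first segment and offset 11 for
the second, because the partial match carries over between calls.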

I will investigate the ICAP and eCAP facilities if I can't get direct 
access to the data from within the code.
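
Whichever route is used, Amos's note about "Cache-Control:no-transform"
still applies. A minimal sketch of that check on a header value, assuming
the value is already available as a string (hasNoTransform is a
hypothetical helper, not a Squid, ICAP, or eCAP API, and it splits
naively on commas without handling quoted strings):

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if a Cache-Control header value
// carries the no-transform directive. Directive names are compared
// case-insensitively.
bool hasNoTransform(const std::string &value) {
    auto notSpace = [](unsigned char c) { return !std::isspace(c); };
    std::size_t pos = 0;
    while (pos <= value.size()) {
        std::size_t comma = value.find(',', pos);
        if (comma == std::string::npos) comma = value.size();
        std::string token = value.substr(pos, comma - pos);
        // Trim leading and trailing whitespace.
        token.erase(token.begin(),
                    std::find_if(token.begin(), token.end(), notSpace));
        token.erase(std::find_if(token.rbegin(), token.rend(), notSpace).base(),
                    token.end());
        std::transform(token.begin(), token.end(), token.begin(),
                       [](unsigned char c) {
                           return static_cast<char>(std::tolower(c));
                       });
        if (token == "no-transform") return true;
        pos = comma + 1;
    }
    return false;
}
```

A message for which this returns true would be passed through unmodified.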

James