[squid-users] Pull/Fetch high level URL requests from Squid access.log without getting all the object hits

Amos Jeffries squid3 at treenet.co.nz
Thu Nov 20 14:07:29 UTC 2014


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 21/11/2014 1:18 a.m., Swapneel Patnekar wrote:
> Hi there,
> 
> I need to pull/fetch high level URL requests from the Squid
> access.log, i.e. URL requests which were typed by the user in the
> browser.
> 
> For example, if the user had typed facebook.com, I want to
> pull/fetch only facebook.com from the access.log and not 
> https://fbstatic-a.akamaihd.net/rsrc.php/v2/yV/r/aXwjx2fqSf4.css
> etc which was not typed by the user in the browser but was
> referenced by facebook.com for the CSS.
> 
> Can this be done?

No, it can't.

There is absolutely no way for Squid to identify what the user (if one
even exists) did with their keyboard (or shortcuts, or bookmarks, or
search bar, or...) to cause the HTTP request to happen.

You can log the contents of the "Referer" header; requests with no
Referer value are usually "first" requests. But that header is not
always sent, and it is also sent when users move from one "page" to
another. So the accuracy is very low for what you are asking to get
out of it.
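
As a rough sketch of that Referer approach, assuming access.log is
written with Squid's built-in "combined" logformat (which records the
Referer header as a quoted field after the status and size), something
like the following would list the URLs of requests that arrived with no
Referer. The function name and the regular expression are illustrative
only and would need adjusting for any other log layout. The result is a
list of candidates, not a reliable record of what was typed.

```python
import re

# Assumed layout: Squid's built-in "combined" logformat, for example
# 10.0.0.5 - - [20/Nov/2014:14:07:29 +0000] "GET http://example.com/ HTTP/1.1" 200 1234 "-" "Mozilla/5.0" TCP_MISS:HIER_DIRECT
COMBINED = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] '                  # client, ident, user, timestamp
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '     # request line
    r'\d+ \S+ '                                  # status code, reply size
    r'"(?P<referer>[^"]*)"'                      # Referer header, "-" when absent
)


def requests_without_referer(lines):
    """Return the URLs of requests whose Referer field is empty or "-".

    Hypothetical helper: these are only *candidate* "first" requests.
    """
    urls = []
    for line in lines:
        match = COMBINED.match(line)
        if match is None:
            continue  # line does not follow the assumed layout
        if match.group("referer") in ("", "-"):
            urls.append(match.group("url"))
    return urls


if __name__ == "__main__":
    import sys

    # Usage: python no_referer.py < /path/to/access.log
    for url in requests_without_referer(sys.stdin):
        print(url)
```

Expect both false positives (clients that never send Referer) and
misses (every page reached by clicking a link carries one).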

FWIW: "facebook.com" itself is a rarely visited page. Most users
search for the company name and click a result, or use history,
bookmarks, or even the emailed links FB sends out. All of these drop
them straight into the middle of some sub-section of the FB site.

Amos

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (MingW32)

iQEcBAEBAgAGBQJUbfWhAAoJELJo5wb/XPRjaeQH/0uWTCtDq9DNvNNXUVo103g4
nFAvaT5kcaEJxRNOCTMerLwSAIrqyhT+SpqnmbSJURskwmW8vZRodIAnQPXPfiCj
QuAazwDywVq0n7SyAtyIzyK0I6qlVtuKD+3VHcCJ30AXMv4RUv3ne8WenVLYggOq
KGfTAS5rXUQvnAKpSz+jRGY4ZS7ZJ7dxrrPZwUxsBqXiNAwJpesZVScnxMtiXdsN
Ko+/CMUDA4i35pAsc/l/GtGQozPtlsMtiXm7V5Vg+p9r01gIIaUWA6DuatBuB6RR
QqMsMWKm66HpeD8Kw/MMx81yKOXpEZYRpuGxoxs9CU/rrAd1uugjUR/3D1QLcgs=
=7n1w
-----END PGP SIGNATURE-----

