<div dir="ltr">I'll give that regex a try. Funny though, mine is just built on the code from <a href="http://lightparser.pl">lightparser.pl</a>, so it must be a problem with the stock code as well: the original 4 entries that shipped with it are exactly like the one I posted.<div><br></div><div>Thanks,</div></div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div dir="ltr"><div dir="ltr"><div><br></div><div>- Marc</div><div><br></div><div>-_-_-_-_-_-_-_-_-_-_-_-</div><div>Marc A. Mapplebeck, MCP/MCDST/MCTS/MCSE/MCDBA/MOS/A+/N+/CNA/CCNA/VCP6-DCV</div><div>ProCom Data</div><div>T: 800-408-3313 x242</div><div>F: 709-256-3031</div><div>E: <a href="mailto:marc.mapplebeck@townsuite.com" target="_blank">marc.mapplebeck@townsuite.com</a></div></div></div></div></div></div></div></div></div></div></div></div>
<br><div class="gmail_quote">On Mon, Mar 28, 2016 at 11:27 PM, Amos Jeffries <span dir="ltr">&lt;<a href="mailto:squid3@treenet.co.nz" target="_blank">squid3@treenet.co.nz</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 29/03/2016 2:53 a.m., Marc Mapplebeck wrote:<br>
> I am currently using squid for our proxy, and recently decided to use<br>
> WPAD/PAC to also capture HTTPS traffic. I am having one very annoying<br>
> issue with lightsquid, and wondering if anybody has any insight.<br>
><br>
> All my lightsquid information looks like the attached image. It also does<br>
> not consolidate the first part of the domain name (even this would be fine,<br>
> so that I can differentiate HTTPS traffic, as long as subdomains are<br>
> combined)<br>
><br>
> I have been modifying my <a href="http://lightparser.pl" rel="noreferrer" target="_blank">lightparser.pl</a> file to consolidate subdomains,<br>
> however, this is only working for HTTP traffic, as all HTTPS sites are<br>
> showing the port number like <a href="http://mail.google.ca:443" rel="noreferrer" target="_blank">mail.google.ca:443</a><br>
<br>
</span>That is the correct URL for those requests. And no, they are not "HTTPS".<br>
They are tunnels through the proxy to the server and port indicated,<br>
which may or may not have HTTPS inside them.<br>
In fact, if that is Google software contacting Google servers, it is far<br>
more likely to be SPDY or WebSockets protocol.<br>
<span class=""><br>
<br>
> The code I am using is:<br>
> $url =~ s/([a-z]+:\/\/)??.*\.(google\.*)/$2/o;<br>
><br>
> Has anybody found a way around this or even thought about this? I was<br>
> thinking of telling squid to not include the port, however, it seems to not<br>
> be working. Any other suggestions/thoughts?<br>
<br>
</span>I suggest you double-check your regex. That pattern contains several<br>
major mistakes. "??" and "\.*" for starters.<br>
&lt;<a href="http://www.regexr.com/" rel="noreferrer" target="_blank">http://www.regexr.com/</a>&gt;<br>
<br>
The pattern for matching "google.*" in the domain is:<br>
s/^([a-z\-\+]+:\/\/)?([^\/?#:]+)?(google\.[^\/?#:]+)/$3/o<br>
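<br>
As a minimal sketch of what that substitution does, here is the same pattern rendered in Python so it can be tried outside lightparser.pl. The function name consolidate and the sample URLs are assumptions made for illustration; they are not part of lightsquid or its stock code.<br>
<br>
```python
import re

# Python rendering of the Perl substitution quoted above. The slashes need
# no escaping here, and Perl's "$3" becomes r"\3".
GOOGLE_HOST = re.compile(r"^([a-z\-\+]+://)?([^/?#:]+)?(google\.[^/?#:]+)")


def consolidate(url: str) -> str:
    """Replace an optional scheme, any subdomains and google.TLD with google.TLD.

    Hypothetical helper name; anything after the host (port, path, query)
    is left untouched because the pattern stops at '/', '?', '#' or ':'.
    """
    return GOOGLE_HOST.sub(r"\3", url, count=1)


if __name__ == "__main__":
    samples = (
        "mail.google.ca:443",                    # CONNECT-style entry
        "http://www.google.com/search?q=squid",  # plain HTTP entry
        "http://example.com/?u=www.google.com",  # google only in the query
    )
    for sample in samples:
        print(sample, "->", consolidate(sample))
```
<br>
With these inputs, mail.google.ca:443 becomes google.ca:443 and the second URL becomes google.com/search?q=squid. The third is left unchanged because the pattern is anchored to the start of the host part. Note that the :443 suffix stays in place, so dropping the port would need a separate step.<br>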
<br>
Amos<br>
<br>
_______________________________________________<br>
squid-users mailing list<br>
<a href="mailto:squid-users@lists.squid-cache.org">squid-users@lists.squid-cache.org</a><br>
<a href="http://lists.squid-cache.org/listinfo/squid-users" rel="noreferrer" target="_blank">http://lists.squid-cache.org/listinfo/squid-users</a><br>
</blockquote></div><br></div>