[squid-users] Upper limit on the number of regular expressions in url_regex?

Ralf Hildebrandt Ralf.Hildebrandt at charite.de
Tue Aug 8 14:24:20 UTC 2017


I'm using this in squid-5.0:

acl markRw_urlbl annotate_transaction accessRule=rw_urlbl
acl rw_urlbl url_regex "/etc/squid5/generated-rw_urlbl.acl"
http_access deny rw_urlbl markRw_urlbl
deny_info   http://proxy.charite.de/rw_urlbl/ markRw_urlbl
# https://ransomwaretracker.abuse.ch/blocklist/ 30.3.16 RHI

And yes, it's quite big:

# wc -l /etc/squid5/generated-rw_urlbl.acl
10783 /etc/squid5/generated-rw_urlbl.acl

During reconfigure I noticed:

2017/08/08 15:56:45.413| WARNING: optimisation of regular expressions failed; using fallback method without optimisation

I then increased debug_options (to 28,9) and found that squid
repeatedly groups the regular expressions until a buffer is "full".
The last such log entry is:

2017/08/08 15:56:45.413| 28,2| RegexData.cc(188) compileOptimisedREs: adding RE 'http://zzzort10xtest123.com/nin5k3bwo'
2017/08/08 15:56:45.413| 28,2| RegexData.cc(194) compileOptimisedREs: buffer full, generating new optimised RE...
2017/08/08 15:56:45.413| 28,2| RegexData.cc(125) compileRE: compiled '(http://zizicamarda.com/7fg3g)|(http://zizzhaida.com/3m6ij)|(http://zizzhaida.com/98g4ubq)|(http://zizzhaida.com/a0s9b)|(http://zjscs.org/oaxqpo4w7)|(http://zlotysalmo.net/0zx0ken3)|(http://zlotysalmo.net/3v8va8ov)|(http://zlotysalmo.net/75vepy6f)|(http://zlotysalmo.net/9v50aob)|(http://znany-lekarz.pl/wd7zj)|(http://zoekeith.com/qehggefyb)|(http://zona-sezona.com.ua/hj1lsp)|(http://zonabest.atspace.com/353wxy)|(http://zonnit.com/qargy9n)|(http://zoologiczny.cba.pl/okp987g7v)|(http://zoomwalls.com/k8j3tpoe)|(http://zoomwalls.com/zghpzv2f)|(http://zoonhers.net/3oojm4)|(http://zoonhers.net/4susie)|(http://zoonhers.net/5ngvr)|(http://zophotos.com/098tb)|(http://zorgboerderijtzicht.nl/lm3mhz)|(http://zpwang.net/9igbmnn)|(http://zsgxbgj.com/1324w)|(http://zsnbystre.republika.pl/988g765f)|(http://zsp17.y0.pl/jkYTFhb7)|(http://zsz_szyn.republika.pl/G7vuYhjb)|(http://zuerich-gewerbe.ch/mbv58gbv)|(http://zui9reica.web.fc2.com/87hcrn33g)|(http://zurrmax.de/hwajuip)|(http://zwei.audio/87h78rf33g)|(http://zwljfc.com/8765r)|(http://zytrade.cn/1324w)|(http://zytrade.cn/aust7a6ik)|(http://zzzort10xtest123.com/nin5k3bwo)' with flags 9

http://zzzort10xtest123.com/nin5k3bwo being the last line in the file
/etc/squid5/generated-rw_urlbl.acl 
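
For illustration, the grouping that the log suggests can be sketched in Python. This is only a rough model of the behaviour visible in the debug output, not Squid's actual code: the function names and the 8192-byte buffer size are assumptions, and Python's re module is not the POSIX regex library that squid uses.

```python
import re

def group_patterns(patterns, buffer_size=8192):
    """Join patterns as '(p1)|(p2)|...' in groups shorter than buffer_size.

    buffer_size is a hypothetical limit chosen for illustration only.
    """
    groups = []
    current = ""
    for pattern in patterns:
        piece = "(" + pattern + ")"
        if len(piece) >= buffer_size:
            # A single pattern that cannot fit makes grouping impossible.
            raise ValueError("pattern too long for buffer: " + pattern[:40])
        candidate = piece if not current else current + "|" + piece
        if len(candidate) >= buffer_size:
            # "buffer full": emit the current group and start a new one.
            groups.append(current)
            current = piece
        else:
            current = candidate
    if current:
        # Whatever is left after the last pattern forms the final group.
        groups.append(current)
    return groups

def compile_groups(patterns, buffer_size=8192):
    """Compile each joined group into one regular expression."""
    return [re.compile(g) for g in group_patterns(patterns, buffer_size)]
```

Under this model a list of 10783 short URLs ends up as a comparatively small number of large alternations, and the final group is compiled once the last line of the file has been added.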

The last compilation seems to fail, and the next line in the log is:

2017/08/08 15:56:45.413| WARNING: optimisation of regular expressions failed; using fallback method without optimisation

whereupon each line becomes its own RE:

2017/08/08 15:56:45.430| 28,2| RegexData.cc(125) compileRE: compiled 'http://00005ik.rcomhost.com/7fg3g' with flags 9
2017/08/08 15:56:45.431| 28,2| RegexData.cc(125) compileRE: compiled 'http://01ad681.netsolhost.com/7j0jlq3' with flags 9
2017/08/08 15:56:45.431| 28,2| RegexData.cc(125) compileRE: compiled 'http://023pc.cn/8hrnv3' with flags 9
2017/08/08 15:56:45.431| 28,2| RegexData.cc(125) compileRE: compiled 'http://027tzx.com/lscpv' with flags 9
...

But why is it failing?

Background:
===========

Running squid with > 10000 regular expressions causes all kinds of
strange behaviour - that's why I noticed the problem in the first place.

-- 
Ralf Hildebrandt                   Charite Universitätsmedizin Berlin
ralf.hildebrandt at charite.de        Campus Benjamin Franklin
https://www.charite.de             Hindenburgdamm 30, 12203 Berlin
Geschäftsbereich IT, Abt. Netzwerk fon: +49-30-450.570.155

