[squid-users] mmap() in squid
Eugene M. Zheganin
emz at norma.perm.ru
Fri Mar 27 13:57:20 UTC 2015
Hi.
Squid has used the mmap() call since 3.4.x, and on FreeBSD mmap() has
one specific flag, MAP_NOSYNC, which keeps dirtied pages from being
flushed to disk periodically:
MAP_NOSYNC   Causes data dirtied via this VM map to be flushed to
             physical media only when necessary (usually by the
             pager) rather than gratuitously. Typically this prevents
             the update daemons from flushing pages dirtied through
             such maps and thus allows efficient sharing of memory
             across unassociated processes using a file-backed shared
             memory map. Without this option any VM pages you dirty
             may be flushed to disk every so often (every 30-60
             seconds usually) which can create performance problems
             if you do not need that to occur (such as when you are
             using shared file-backed mmap regions for IPC purposes).
             Note that VM/file system coherency is maintained whether
             you use MAP_NOSYNC or not. This option is not portable
             across UNIX platforms (yet), though some may implement
             the same behavior by default.

             WARNING! Extending a file with ftruncate(2), thus
             creating a big hole, and then filling the hole by
             modifying a shared mmap() can lead to severe file
             fragmentation. In order to avoid such fragmentation you
             should always pre-allocate the file's backing store by
             write()ing zero's into the newly extended area prior to
             modifying the area via your mmap(). The fragmentation
             problem is especially sensitive to MAP_NOSYNC pages,
             because pages may be flushed to disk in a totally random
             order.

             The same applies when using MAP_NOSYNC to implement a
             file-based shared memory store. It is recommended that
             you create the backing store by write()ing zero's to the
             backing file rather than ftruncate()ing it. You can test
             file fragmentation by observing the KB/t (kilobytes per
             transfer) results from an ``iostat 1'' while reading a
             large file sequentially, e.g. using
             ``dd if=filename of=/dev/null bs=32k''.

             The fsync(2) system call will flush all dirty data and
             metadata associated with a file, including dirty NOSYNC
             VM data, to physical media. The sync(8) command and
             sync(2) system call generally do not flush dirty NOSYNC
             VM data. The msync(2) system call is obsolete since BSD
             implements a coherent file system buffer cache. However,
             it may be used to associate dirty VM pages with file
             system buffers and thus cause them to be flushed to
             physical media sooner rather than later.
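
To make the pre-allocation advice in the excerpt above concrete, here is
a minimal sketch in C of creating the backing store by write()ing zeros
instead of ftruncate()ing. The helper name preallocate_zeros() and the
32 KB chunk size are my own assumptions for illustration; this is not
code taken from squid or from the FreeBSD sources.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from squid): fill the first `size` bytes of
 * the file behind `fd` with zeros using write(), so the blocks are
 * really allocated before the file is modified through a shared mmap().
 * Returns 0 on success, -1 on error.
 */
static int
preallocate_zeros(int fd, off_t size)
{
    char buf[32 * 1024];          /* chunk size is an arbitrary choice */
    off_t done = 0;

    assert(size >= 0);
    memset(buf, 0, sizeof(buf));

    /* Start writing at the beginning of the file. */
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
        return -1;

    while (done < size) {
        size_t want = sizeof(buf);
        if ((off_t)want > size - done)
            want = (size_t)(size - done);

        ssize_t n = write(fd, buf, want);
        if (n <= 0) {
            if (n < 0 && errno == EINTR)
                continue;         /* interrupted, retry the same chunk */
            return -1;
        }
        done += n;                /* short writes are simply continued */
    }
    return 0;
}
```

A caller would open the backing file with O_RDWR | O_CREAT, call
preallocate_zeros(fd, size), and only then mmap() it.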
Last year there was an issue with PostgreSQL, which also started to use
mmap() in its 9.3 release and suffered a huge performance regression on
FreeBSD. One of the measures taken against this regression (though not
the only one) was adding MAP_NOSYNC in the postgresql port. So I decided
to do the same for my local squid. I created a patch that adds this flag
to both occurrences of mmap(). I have been running squid 3.4.x patched
this way for about half a year. A couple of days ago I submitted this
patch to the FreeBSD ports system, and the squid port maintainer asked
me whether I'm sure squid on FreeBSD really needs it. Since I'm not a
skilled programmer (though I think using mmap() with MAP_NOSYNC is a
good thing), I decided to ask here: is this flag worth bothering with,
given that squid isn't a database engine?
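
For illustration, this is roughly the kind of change I mean, written as
a standalone sketch in C rather than as the actual squid code. The
helper name map_file_nosync() is hypothetical, and defining MAP_NOSYNC
to 0 where the platform lacks it is an assumption made so the same
source builds outside FreeBSD.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* MAP_NOSYNC is FreeBSD-specific; elsewhere add no extra flag. */
#ifndef MAP_NOSYNC
#define MAP_NOSYNC 0
#endif

/*
 * Hypothetical helper (not from squid): map the first `len` bytes of an
 * existing, already sized file as shared read/write memory, asking the
 * kernel not to flush dirtied pages periodically.
 * Returns MAP_FAILED on error, like mmap() itself.
 */
static void *
map_file_nosync(const char *path, size_t len)
{
    int fd;
    void *p;

    assert(path != NULL && len > 0);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return MAP_FAILED;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_NOSYNC, fd, 0);

    close(fd);    /* the mapping stays valid after the descriptor closes */
    return p;
}
```

The only difference from a plain shared mapping is the extra
MAP_NOSYNC bit in the flags argument.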
Thanks.