[squid-users] mmap() in squid

Eugene M. Zheganin emz at norma.perm.ru
Fri Mar 27 13:57:20 UTC 2015


Hi.

Squid has used the mmap() call since 3.4.x. On FreeBSD, mmap() has one
specific flag, MAP_NOSYNC, which keeps dirtied pages from being flushed
to disk until that is actually necessary:

MAP_NOSYNC        Causes data dirtied via this VM map to be flushed to
                       physical media only when necessary (usually by the
                       pager) rather than gratuitously.  Typically this
                       prevents the update daemons from flushing pages
                       dirtied through such maps and thus allows efficient
                       sharing of memory across unassociated processes
                       using a file-backed shared memory map.  Without
                       this option any VM pages you dirty may be flushed
                       to disk every so often (every 30-60 seconds
                       usually) which can create performance problems if
                       you do not need that to occur (such as when you
                       are using shared file-backed mmap regions for IPC
                       purposes).  Note that VM/file system coherency is
                       maintained whether you use MAP_NOSYNC or not.
                       This option is not portable across UNIX platforms
                       (yet), though some may implement the same behavior
                       by default.

                       WARNING!  Extending a file with ftruncate(2), thus
                       creating a big hole, and then filling the hole by
                       modifying a shared mmap() can lead to severe file
                       fragmentation.  In order to avoid such
                       fragmentation you should always pre-allocate the
                       file's backing store by write()ing zero's into the
                       newly extended area prior to modifying the area
                       via your mmap().  The fragmentation problem is
                       especially sensitive to MAP_NOSYNC pages, because
                       pages may be flushed to disk in a totally random
                       order.

                       The same applies when using MAP_NOSYNC to implement a
                       file-based shared memory store.  It is recommended
                       that you create the backing store by write()ing
                       zero's to the backing file rather than
                       ftruncate()ing it.  You can test file fragmentation
                       by observing the KB/t (kilobytes per transfer)
                       results from an ``iostat 1'' while reading a large
                       file sequentially, e.g. using
                       ``dd if=filename of=/dev/null bs=32k''.

                       The fsync(2) system call will flush all dirty data
                       and metadata associated with a file, including
                       dirty NOSYNC VM data, to physical media.  The
                       sync(8) command and sync(2) system call generally
                       do not flush dirty NOSYNC VM data.  The msync(2)
                       system call is obsolete since BSD implements a
                       coherent file system buffer cache.  However, it may
                       be used to associate dirty VM pages with file
                       system buffers and thus cause them to be flushed to
                       physical media sooner rather than later.
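
The pre-allocation advice quoted above can be sketched roughly as
follows. This is only an illustration, not code from squid or from the
man page: the helper names preallocate_zeroed() and demo_preallocate(),
the 64 KB chunk size and the /tmp path are all made up for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: append `len` zero bytes at the current offset of
 * fd with write(), instead of extending the file with ftruncate().
 * Returns 0 on success, -1 on error. */
int preallocate_zeroed(int fd, size_t len)
{
    static const char zeros[65536];   /* zero-initialised chunk (size is arbitrary) */

    while (len > 0) {
        size_t chunk = len < sizeof(zeros) ? len : sizeof(zeros);
        ssize_t n = write(fd, zeros, chunk);
        if (n < 0) {
            if (errno == EINTR)
                continue;             /* interrupted, retry */
            return -1;
        }
        if (n == 0)
            return -1;                /* no progress, give up */
        len -= (size_t)n;             /* short writes are handled by looping */
    }
    return 0;
}

/* Hypothetical demo: pre-allocate a temporary file, fsync() it, and
 * return its resulting size in bytes (or -1 on error). */
long demo_preallocate(size_t len)
{
    char path[] = "/tmp/prealloc-demo-XXXXXX";
    struct stat st;
    long size = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                     /* file disappears once fd is closed */

    if (preallocate_zeroed(fd, len) == 0 &&
        fsync(fd) == 0 &&             /* flushes dirty data and metadata */
        fstat(fd, &st) == 0)
        size = (long)st.st_size;

    close(fd);
    return size;
}
```

A file prepared this way would then be mmap()ed and modified in place,
so that no holes are left to be filled in random order.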

Last year there was an issue with PostgreSQL, which also started to use
mmap() in its 9.3 release and had a huge regression on FreeBSD. One of
the measures taken against this regression (but not the only one) was
adding MAP_NOSYNC to the postgresql port. So I decided to do the same
for my local squid. I created a patch that adds this flag to both
occurrences of mmap(). I have been running squid 3.4.x patched this way
for about half a year. A couple of days ago I submitted this patch to
the FreeBSD ports system, and the squid port maintainer asked me whether
I'm sure squid on FreeBSD really needs it. Since I'm not a skilled
programmer (though I think using mmap() with MAP_NOSYNC is a good
thing), I decided to ask here: is this flag worth bothering with, given
that squid isn't a database engine?
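
To illustrate the kind of change I mean, here is a minimal sketch. It is
not the actual patch and not squid's code: the names SHARED_MAP_FLAGS,
map_shared_segment() and demo_map_roundtrip() are invented for the
example. The flag is guarded with #ifdef because MAP_NOSYNC does not
exist on every platform.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* MAP_NOSYNC is not portable, so only add it where the headers define it. */
#ifdef MAP_NOSYNC
#define SHARED_MAP_FLAGS (MAP_SHARED | MAP_NOSYNC)
#else
#define SHARED_MAP_FLAGS MAP_SHARED
#endif

/* Hypothetical helper: map `len` bytes of fd as a shared read/write
 * region. Returns NULL on failure. */
void *map_shared_segment(int fd, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   SHARED_MAP_FLAGS, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical demo: back one 4096-byte region with a temporary file,
 * write through the mapping and read the data back.
 * Returns 0 on success, -1 on failure. */
int demo_map_roundtrip(void)
{
    char path[] = "/tmp/nosync-demo-XXXXXX";
    char zeros[4096] = {0};
    const size_t len = sizeof(zeros);
    char *p;
    int ok;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                     /* file disappears once fd is closed */

    /* Create the backing store by write()ing zeros, not by ftruncate(). */
    if (write(fd, zeros, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    p = map_shared_segment(fd, len);
    if (p == NULL) {
        close(fd);
        return -1;
    }

    memcpy(p, "squid", 6);            /* dirty a page through the mapping */
    ok = (memcmp(p, "squid", 6) == 0);

    munmap(p, len);
    close(fd);
    return ok ? 0 : -1;
}
```

On systems without MAP_NOSYNC this compiles to a plain MAP_SHARED
mapping, so the same source should build everywhere.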

Thanks.

