Btrfs requires noatime
Traditionally, UNIX filesystems also maintain the access time
(atime) of each file. This is very much an anti-feature because it
turns each read operation into a write operation, which is
obviously bad for performance. Since Btrfs is a copy-on-write
(COW) filesystem, maintaining atimes is even more painful. Thus, a
sensible recommendation is to make sure that every filesystem is
mounted with the noatime option, especially if it is a Btrfs
filesystem.
Btrfs Case Study
The following is a typical example of how writing the atime can significantly affect performance.
The symptoms: an rsync job from a Btrfs filesystem
(located on an SSD) to an external USB 3 disk drive with XFS runs
at ~17 MiB/s, while iotop and dstat show significantly
higher I/O rates for the source device, e.g. ~60 MiB/s.
Reasons: This Btrfs filesystem is mounted with default options,
which means that relatime is active instead of noatime,
resulting in atime updates for effectively every file.
In addition to that, the filesystem contains some snapshots, i.e. for each subvolume there are also one or more snapshots. Creating a filesystem snapshot is very cheap on Btrfs because of its COW design. An atime update of a snapshotted file only updates the atime of one copy. Thus, in the likely case that the filesystem block holding the atime is still shared, it has to be copied on that atime write. This means that a single read results in 2 or more write operations.
Resolution: After canceling the rsync command,
remounting the Btrfs filesystem with noatime, and restarting
the rsync job, it sure enough performs much better, i.e.
the throughput reported by rsync matches the dstat numbers, as expected.
Note that the number of files included in a snapshot is most relevant for this issue, not necessarily the number of snapshots.
If you are really unlucky, running a backup job (or even just a grep) on a Btrfs filesystem mounted without noatime might even yield out-of-space errors in case all the copy-on-write atime updates exceed the available free space.
Relatime
Since 2009 or so, Linux kernels mount filesystems with
the relatime option by default. With this option, the atime is updated
somewhat less frequently. In the presence of reads,
it is still updated at least once a day:
relatime Update inode access times relative to modify or change time. Access time is only updated if the previous access time was earlier than the current modify or change time. (Similar to noatime, but it doesn't break mutt or other applications that need to know if a file has been read since the last time it was modified.) Since Linux 2.6.30, the kernel defaults to the behavior provided by this option (unless noatime was specified), and the strictatime option is required to obtain traditional semantics. In addition, since Linux 2.6.30, the file's last access time is always updated if it is more than 1 day old.
(mount(8))
Thus, relatime doesn't help much with the principal cause of
the described performance issue. You still likely get massive
atime-update-induced writes during bulk read-only activity like
backup or search jobs. All it takes is a timespan of more than
24 hours in which most files aren't read, which is a pretty standard
scenario.
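The rule quoted from mount(8) can be written down as a small predicate, which makes it easier to see why a nightly backup triggers an atime write for nearly every file. This is only a sketch of the documented behaviour, not the kernel code: the function and parameter names are made up for illustration, timestamps are plain seconds, and the handling of exact ties (atime equal to mtime, or exactly 24 hours old) is an assumption.

```python
DAY = 24 * 60 * 60  # one day in seconds


def relatime_updates_atime(atime, mtime, ctime, now):
    """Return True if a read at `now` would update atime under relatime.

    Sketch of the rule described in mount(8); all arguments are
    timestamps in seconds. Boundary cases (ties) are an assumption.
    """
    # The file was modified or changed since it was last read.
    if atime <= mtime or atime <= ctime:
        return True
    # The stored atime is at least one day old.
    return now - atime >= DAY


if __name__ == "__main__":
    last_read = 1_000_000
    # Untouched file, read again one hour later: no atime write.
    print(relatime_updates_atime(last_read, 500_000, 500_000, last_read + 3600))
    # Same file, read again two days later (e.g. by a backup job): atime write.
    print(relatime_updates_atime(last_read, 500_000, 500_000, last_read + 2 * DAY))
```

Under this rule, any file that has not been read for a day costs one atime write on its next read, which is exactly the situation a daily backup or a recursive search runs into.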
Unfortunately, switching the system default from relatime to noatime isn't
possible (as of 2017-02). Thus, one has to remember to always specify
noatime: think mount -o noatime ... and add it to each /etc/fstab
entry.
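Since noatime has to be remembered for every mount, a small check can help spot filesystems that were missed. The following is a minimal sketch that assumes the mount table has the usual whitespace-separated layout of /proc/self/mounts (device, mount point, filesystem type, options, ...). The function name and the sample table are hypothetical.

```python
def mounts_without_noatime(mounts_text, fstypes=("btrfs",)):
    """Return (mount_point, options) pairs that lack the noatime option.

    `mounts_text` is expected to look like /proc/self/mounts;
    only filesystems whose type is listed in `fstypes` are checked.
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fstype, options = fields[:4]
        if fstype in fstypes and "noatime" not in options.split(","):
            missing.append((mount_point, options))
    return missing


# Made-up sample mount table, for illustration only.
SAMPLE = """\
/dev/sda2 / btrfs rw,relatime,ssd,space_cache,subvolid=5,subvol=/ 0 0
/dev/sdb1 /mnt/backup xfs rw,noatime,attr2,inode64 0 0
/dev/sda3 /home btrfs rw,noatime,ssd,space_cache 0 0
"""

if __name__ == "__main__":
    for mount_point, options in mounts_without_noatime(SAMPLE):
        print(f"{mount_point}: {options}")
```

On a live Linux system, the same function could be fed the contents of /proc/self/mounts instead of the sample string, with fstypes extended to whatever other filesystem types are in use.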
Testing
Testing the effect of atime updates in the presence of snapshots
is easier when the Btrfs filesystem is mounted with
strictatime. This overrides the relatime default option.
The time attributes of a file or directory can be displayed with
ls, but the GNU utility stat is more convenient for printing
all timestamps.
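For scripted experiments, the same timestamps are also available through Python's os.stat. The sketch below prints all three timestamps of a file and reports whether reading it changed the atime. The helper names are hypothetical, and the result of the read check depends entirely on the mount options of the filesystem holding the file (noatime, relatime, or strictatime).

```python
import os
import tempfile
from datetime import datetime, timezone


def file_times(path):
    """Return atime, mtime and ctime of `path` as ISO 8601 strings (UTC)."""
    st = os.stat(path)

    def fmt(seconds):
        return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()

    return {
        "atime": fmt(st.st_atime),
        "mtime": fmt(st.st_mtime),
        "ctime": fmt(st.st_ctime),
    }


def atime_changed_by_read(path):
    """Read one byte from `path` and report whether its atime changed."""
    before = os.stat(path).st_atime_ns
    with open(path, "rb") as f:
        f.read(1)
    after = os.stat(path).st_atime_ns
    return after != before


if __name__ == "__main__":
    # Throwaway file in the default temp directory; for a real test,
    # use a file on the Btrfs filesystem under investigation instead.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello\n")
    try:
        print("before:", file_times(tmp.name))
        print("atime changed by read:", atime_changed_by_read(tmp.name))
        print("after: ", file_times(tmp.name))
    finally:
        os.unlink(tmp.name)
```

With noatime the check is expected to report no change, while with strictatime every read should move the atime forward. Under relatime the outcome depends on how old the stored atime is.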
P.S.
There are use cases for an actively maintained file atime. Examples are measuring the usage of installed binaries (cf. Debian's popularity-contest) or detecting unread mails. But there are better alternatives for implementing such tasks, and thus those aren't very convincing arguments for enabling atime writing by default (be it relative or even strict). In conclusion, writing the access time means much pain with little gain.
See also
- Atime and btrfs: a bad combination? - this is also a counter-example to Betteridge's law of headlines