From: Christoph Anton Mitterer <calestyo@scientia.net>
To: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Reproducer for "compressed data + hole data corruption bug, 2018 edition" still works on 4.20.7
Date: Mon, 04 Mar 2019 16:34:39 +0100
Message-ID: <f9fddae4bc3d59e539b7bc56ae75a5f04a165682.camel@scientia.net>
In-Reply-To: <20190215054031.GC9995@hungrycats.org>
Hey.
Thanks for your elaborate explanations :-)
On Fri, 2019-02-15 at 00:40 -0500, Zygo Blaxell wrote:
> The problem occurs only on reads. Data that is written to disk will
> be OK, and can be read correctly by a fixed kernel.
>
> A kernel without the fix will give corrupt data on reads with no
> indication of corruption other than the changes to the data itself.
>
> Applications that copy data may read corrupted data and write it back
> to the filesystem. This will make the corruption permanent in the
> copied data.
So that basically means even a cp (without refcopy) or a btrfs
send/receive could already cause permanent silent data corruption.
Of course, only if the conditions you've described below are met.
> Given the age of the bug
Since when was it in the kernel?
> Even if compression is enabled, the file data must be compressed for
> the bug to corrupt it.
Is there a simple way to find files (i.e. pathnames) that were actually
compressed?
> - you never punch holes in files
Is there any "standard application" (like cp, tar, etc.) that would do
this?
> - you never dedupe or clone files
What do you mean by clone? refcopy? Would btrfs snapshots or btrfs
send/receive be affected?
Or is there anything in btrfs itself which does either of the two by
default or on a typical system (i.e. I didn't use dedupe)?
Also, did the bug only affect data, or could metadata also be
affected... basically should such filesystems be re-created since they
may also hold corruptions in the meta-data like trees and so on?
> > compression),... or only when specific file operations were done (I
> > did e.g. cp with refcopy, but I think none of the standard tools does
> > hole-punching)?
> That depends on whether you consider fallocate or qemu to be standard
> tools.
I assume you mean the fallocate(1) program,... because I wouldn't know
whether any of cp/mv/etc. does the system call fallocate(2) by
default.
My scenario looks roughly as follows, and given your explanations, I'd
assume I should probably be safe:
- my normal laptop doesn't use compress, so it's safe anyway
- my cp has an alias to always have --reflink=auto
- two 8TB data archive disks, each with two backup disks to which the
data of the two master disks is btrfs sent/received,... which were
all mounted with compress
- typically I either cp or mv data from the laptop to these disks,
=> should then be safe as the laptop fs didn't use compress,...
- or I directly create the files on the data disks (which use compress)
by means of wget, scp or similar from other sources
=> should be safe, too, as they probably don't do dedupe/hole
punching by default
- or I cp/mv from camera SD cards, which use some *FAT
=> so again I'd expect that to be fine
- on vacation I had the case that I put a large amount of pictures/videos
from SD cards to some btrfs-with-compress mobile HDDs, and back home
from these HDDs to my actual data HDDs.
=> here I do have the read / re-write pattern, so data could have
been corrupted if it was compressed + deduped/hole-punched
I'd guess that's anyway not the case (JPEGs/MPEGs don't compress
well)... and AFAIU there would be no deduping/hole-punching
involved here
- on my main data disks, I do snapshots... and these snapshots I
send/receive to the other (also compress-mounted) btrfs disks.
=> could these operations involve deduping/hole-punching and thus the
corruption?
Another thing:
I always store SHA512 hashsums of files as an XATTR of them (like
"directly after" creating such files).
I assume there would be no deduping/hole-punching involved till then,
so the sums should be from correct data, right?
But when I e.g. copy data from SD, to mobile btrfs-HDD and then to the
final archive HDD... corruption could in principle occur when copying
from mobile HDD to archive HDD.
In that case, would a diff between the two show me the corruption? I
guess not because the diff would likely get the same corruption on
read?
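A minimal sketch of checking a file against a hash sum stored at creation time, as described above. The XATTR name user.sha512 and the hex encoding are assumptions for illustration only, and os.getxattr is Linux-specific. Note that, per the explanation quoted earlier, an unfixed kernel may return corrupt data on read, so a mismatch seen there does not by itself show that the data on disk is bad.

```python
import hashlib
import os

# Hypothetical attribute name; the real name used for the hash sums is
# not stated in this thread.
XATTR_NAME = "user.sha512"


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA512 of the file contents as read from the kernel."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_stored_hash(path, xattr_name=XATTR_NAME):
    """Compare the current contents with the hash stored in an XATTR.

    Returns True or False for match or mismatch, and None if no stored
    hash can be read (attribute missing, xattrs unsupported, non-Linux).
    """
    try:
        stored = os.getxattr(path, xattr_name)
    except (AttributeError, OSError):
        return None
    return stored.decode("ascii", errors="replace").strip() == sha512_of(path)
```

Running matches_stored_hash() on the copy on the archive HDD compares what is read there against the sum taken right after creation, rather than against a second read of the source.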
> "Ordinary" sparse files (made by seeking forward while writing, as done
> by older Unix utilities including cp, tar, rsync, cpio, binutils) do not
> trigger this bug. An ordinary sparse file has two distinct data extents
> from two different writes separated by a hole which has never contained
> file data. A punched hole splits an existing single data extent into two
> pieces with a newly created hole between them that replaces previously
> existing file data. These actions create different extent reference
> patterns and only the hole-punching one is affected by the bug.
>
> Files that contain no blocks full of zeros will not be affected by
> fallocate-d-style hole punching (it searches for existing zeros and
> punches holes over them--no zeros, no holes). If the hole punching
> intentionally introduces zeros where zeros did not exist before (e.g.
> qemu discard operations on raw image files) then it may trigger the bug.
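To make the distinction in the quoted text concrete, here is a rough sketch of the two ways a hole can come into existence. It is only an illustration under assumptions: the function names are made up, the fallocate(2) call goes through ctypes on 64-bit Linux, and it says nothing about whether a particular file would actually hit the bug.

```python
import ctypes
import ctypes.util
import os

# Flag values from <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02


def write_sparse_by_seeking(path, head, gap, tail):
    """'Ordinary' sparse file: two writes with a never-written gap between."""
    with open(path, "wb") as f:
        f.write(head)
        f.seek(gap, os.SEEK_CUR)  # skip forward; no data is ever written here
        f.write(tail)


def punch_hole(path, offset, length):
    """Punch a hole over data that was already written; size is unchanged.

    Returns True on success, False if libc or the filesystem does not
    support it. Assumes a 64-bit off_t.
    """
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    try:
        fallocate = libc.fallocate
    except AttributeError:
        return False
    fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                          ctypes.c_int64, ctypes.c_int64]
    fallocate.restype = ctypes.c_int
    fd = os.open(path, os.O_RDWR)
    try:
        mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
        return fallocate(fd, mode, offset, length) == 0
    finally:
        os.close(fd)
```

In both cases the hole reads back as zeros; the difference described above is that the second one replaces data that previously existed inside an extent.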
So long story short, "normal" file operations (cp/mv, etc.) should not
trigger the bug.
qemu with discard would be a prominent example of triggering the bug,
but luckily for me, I only use this on an fs with compress disabled :-D
Any other such prominent examples?
I assume normal mv or refcopy (i.e. cp --reflink=auto) would not punch
holes and thus not be affected?
Further, I'd assume XATTRs couldn't be affected?
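For reference, a small sketch of what a refcopy amounts to at the system-call level: the FICLONE ioctl, with a fallback to a plain copy in the spirit of --reflink=auto. The function name is made up and this is not cp's actual implementation. It only illustrates what "clone" refers to; whether a given clone pattern is affected by the bug is not something the sketch can show.

```python
import fcntl
import shutil

# _IOW(0x94, 9, int) from <linux/fs.h>
FICLONE = 0x40049409


def reflink_auto(src, dst):
    """Clone src to dst if the filesystem allows it, else copy the data.

    Returns True if the extents were cloned (shared), False if an
    ordinary read/write copy was made instead.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return True
        except OSError:
            # Not supported here (e.g. not btrfs, or different filesystems):
            # fall back to reading the source and writing the destination.
            shutil.copyfileobj(fsrc, fdst)
            return False
```

The fallback path is the read / re-write pattern discussed earlier in this mail; the clone path does not read the file data through the application at all.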
So what remains unanswered is send/receive:
> btrfs send and receive may be affected, but I don't use them so I don't
> have any experience of the bug related to these tools. It seems from
> reading the btrfs receive code that it lacks any code capable of punching
> a hole, but I'm only doing a quick search for words like "punch", not
> a detailed code analysis.
Is there some other developer who possibly knows whether send/receive
would have been vulnerable to the issue?
But since I use send/receive anyway in just one direction from the
master to the backup disks... only the latter could be affected.
Thanks,
Chris.