Message ID | e7fe7225-4f2b-d13e-bb4b-c7db68f63124@redhat.com |
---|---|
State | New |
Headers | show |
Series | xfs: trim IO to found COW exent limit | expand |
Here's a reproducer for the bug. Requires sufficient privs to mount a loopback fs. ====== #!/bin/bash mkdir -p mnt umount mnt &>/dev/null # Create 8g fs image mkfs.xfs -f -dfile,name=fsfile.img,size=8g &>/dev/null mount -o loop fsfile.img mnt # Make all files w/ 1m hints; create original 2m file xfs_io -c "extsize 1048576" mnt xfs_io -c "cowextsize 1048576" mnt echo "Create file mnt/b" xfs_io -f -c "pwrite -S 0x0 0 2m" -c fsync mnt/b &>/dev/null # Make a reflinked copy echo "Reflink copy from mnt/b to mnt/a" cp --reflink=always mnt/b mnt/a echo "Contents of mnt/b" hexdump -C mnt/b # Cycle mount to get stuff out of cache umount mnt mount -o loop fsfile.img mnt # Create a 1m-hinted IO at offset 0, then # do another IO that overlaps but extends past the 1m hint echo "Write to mnt/a" xfs_io -c "pwrite -S 0xa 0k -b 4k 4k" \ -c "pwrite -S 0xa 4k -b 1m 1m" \ mnt/a &>/dev/null xfs_io -c fsync mnt/a echo "Contents of mnt/b now:" hexdump -C mnt/b umount mnt
On Wed, Sep 23, 2020 at 05:35:44PM -0500, Eric Sandeen wrote: > A bug existed in the XFS reflink code between v5.1 and v5.5 in which > the mapping for a COW IO was not trimmed to the mapping of the COW > extent that was found. This resulted in a too-short copy, and > corruption of other files which shared the original extent. > > (This happened only when extent size hints were set, which bypasses > delalloc and led to this code path.) > > This was (inadvertently) fixed upstream with > > 36adcbace24e "xfs: fill out the srcmap in iomap_begin" > > and related patches which moved lots of this functionality to > the iomap subsystem. > > Hence, this is a -stable only patch, targeted to fix this > corruption vector without other major code changes. > > Fixes: 78f0cc9d55cb ("xfs: don't use delalloc extents for COW on files with extsize hints") > Cc: <stable@vger.kernel.org> # 5.4.x > Signed-off-by: Eric Sandeen <sandeen@redhat.com> Looks good to me, Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> (Note: Someone please fix the typo "exent" in the subject line.) --D > --- > > I've tested this with a targeted reproducer (in next email) as well as > with xfstests. > > Stable folk, not sure how to send a "stable only" patch, or if that's even > valid. Assuming you're willing to accept it, I would still like to have > some formal Reviewed-by's from the xfs developer community before it gets > merged. > > Big thanks to Darrick & Dave for letting me whine about this bug and > offering suggestions for testing and ultimately, a patch to test. > > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c > index 06b9e0aacf54..3289d0f4bb03 100644 > --- a/fs/xfs/xfs_iomap.c > +++ b/fs/xfs/xfs_iomap.c > @@ -1002,9 +1002,15 @@ xfs_file_iomap_begin( > * I/O, which must be block aligned, we need to report the > * newly allocated address. If the data fork has a hole, copy > * the COW fork mapping to avoid allocating to the data fork. > + * > + * Otherwise, ensure that the imap range does not extend past > + * the range allocated/found in cmap. > */ > if (directio || imap.br_startblock == HOLESTARTBLOCK) > imap = cmap; > + else > + xfs_trim_extent(&imap, cmap.br_startoff, > + cmap.br_blockcount); > > end_fsb = imap.br_startoff + imap.br_blockcount; > length = XFS_FSB_TO_B(mp, end_fsb) - offset; >
On Wed, Sep 23, 2020 at 05:37:39PM -0500, Eric Sandeen wrote: > Here's a reproducer for the bug. > > Requires sufficient privs to mount a loopback fs. Please turn this into an fstest case, but otherwise this looks reasonable. --D > > ====== > > #!/bin/bash > > mkdir -p mnt > umount mnt &>/dev/null > > # Create 8g fs image > mkfs.xfs -f -dfile,name=fsfile.img,size=8g &>/dev/null > mount -o loop fsfile.img mnt > > # Make all files w/ 1m hints; create original 2m file > xfs_io -c "extsize 1048576" mnt > xfs_io -c "cowextsize 1048576" mnt > > echo "Create file mnt/b" > xfs_io -f -c "pwrite -S 0x0 0 2m" -c fsync mnt/b &>/dev/null > > # Make a reflinked copy > echo "Reflink copy from mnt/b to mnt/a" > cp --reflink=always mnt/b mnt/a > > echo "Contents of mnt/b" > hexdump -C mnt/b > > # Cycle mount to get stuff out of cache > umount mnt > mount -o loop fsfile.img mnt > > # Create a 1m-hinted IO at offset 0, then > # do another IO that overlaps but extends past the 1m hint > echo "Write to mnt/a" > xfs_io -c "pwrite -S 0xa 0k -b 4k 4k" \ > -c "pwrite -S 0xa 4k -b 1m 1m" \ > mnt/a &>/dev/null > > xfs_io -c fsync mnt/a > > echo "Contents of mnt/b now:" > hexdump -C mnt/b > > umount mnt >
On Wed, Sep 23, 2020 at 05:35:44PM -0500, Eric Sandeen wrote: > A bug existed in the XFS reflink code between v5.1 and v5.5 in which > the mapping for a COW IO was not trimmed to the mapping of the COW > extent that was found. This resulted in a too-short copy, and > corruption of other files which shared the original extent. > > (This happened only when extent size hints were set, which bypasses > delalloc and led to this code path.) > > This was (inadvertently) fixed upstream with > > 36adcbace24e "xfs: fill out the srcmap in iomap_begin" > > and related patches which moved lots of this functionality to > the iomap subsystem. > > Hence, this is a -stable only patch, targeted to fix this > corruption vector without other major code changes. > > Fixes: 78f0cc9d55cb ("xfs: don't use delalloc extents for COW on files with extsize hints") > Cc: <stable@vger.kernel.org> # 5.4.x > Signed-off-by: Eric Sandeen <sandeen@redhat.com> Looks good, Reviewed-by: Christoph Hellwig <hch@lst.de> and as Darrick said we'll want to wire up the reproducer for xfstests.
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 06b9e0aacf54..3289d0f4bb03 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -1002,9 +1002,15 @@ xfs_file_iomap_begin( * I/O, which must be block aligned, we need to report the * newly allocated address. If the data fork has a hole, copy * the COW fork mapping to avoid allocating to the data fork. + * + * Otherwise, ensure that the imap range does not extend past + * the range allocated/found in cmap. */ if (directio || imap.br_startblock == HOLESTARTBLOCK) imap = cmap; + else + xfs_trim_extent(&imap, cmap.br_startoff, + cmap.br_blockcount); end_fsb = imap.br_startoff + imap.br_blockcount; length = XFS_FSB_TO_B(mp, end_fsb) - offset;
A bug existed in the XFS reflink code between v5.1 and v5.5 in which the mapping for a COW IO was not trimmed to the mapping of the COW extent that was found. This resulted in a too-short copy, and corruption of other files which shared the original extent. (This happened only when extent size hints were set, which bypasses delalloc and led to this code path.) This was (inadvertently) fixed upstream with 36adcbace24e "xfs: fill out the srcmap in iomap_begin" and related patches which moved lots of this functionality to the iomap subsystem. Hence, this is a -stable only patch, targeted to fix this corruption vector without other major code changes. Fixes: 78f0cc9d55cb ("xfs: don't use delalloc extents for COW on files with extsize hints") Cc: <stable@vger.kernel.org> # 5.4.x Signed-off-by: Eric Sandeen <sandeen@redhat.com> --- I've tested this with a targeted reproducer (in next email) as well as with xfstests. Stable folk, not sure how to send a "stable only" patch, or if that's even valid. Assuming you're willing to accept it, I would still like to have some formal Reviewed-by's from the xfs developer community before it gets merged. Big thanks to Darrick & Dave for letting me whine about this bug and offering suggestions for testing and ultimately, a patch to test.