Reflinks for XFS are available in Fedora 27, so you no longer need to pull and compile xfsprogs from git.

To use reflinks on XFS, you need to create the file system with the -m reflink=1 option, where filesystem is the target device or image file:

# mkfs.xfs -m reflink=1 filesystem

In my example I just created an image file, put the file system on it, and mounted it through a loop device.
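
Creating the image file itself is a one-liner. A minimal sketch, assuming a sparse 1 GiB file; the size is chosen here only to roughly match the 1014M total that df reports further down, and the original setup may have used something else:

```shell
# Assumption: a sparse 1 GiB image; the size is illustrative, not taken from the original setup.
truncate -s 1G test.img
# The apparent size is 1 GiB, but a sparse file occupies almost no disk space until it is written to.
ls -lsh test.img
```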

# mkfs.xfs -m reflink=1 test.img

Then I’ll mount it:

# mount -o loop test.img /mnt

We can now create a 100M file of random data on the new file system to copy:

# dd if=/dev/urandom of=test bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.594451 s, 176 MB/s

df shows 140M used, which is the 100M file plus roughly 40M that the file system itself accounts for:

/dev/loop0              1014M  140M  875M  14% /mnt

So let’s copy the file with reflinks enabled:

# cp -v --reflink=always test testfile
'test' -> 'testfile'
# ls -lsh
total 200M
100M -rw-r--r--. 1 root root 100M Mar  4 18:40 test
100M -rw-r--r--. 1 root root 100M Mar  4 18:43 testfile

Both copies are listed as 100M, but df shows the same amount of space used as before, because the two files share the same extents on disk:

/dev/loop0              1014M  140M  875M  14% /mnt
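
If you want to check for yourself that a reflink copy is a complete file with identical contents, you can compare the two files and look at their extent maps. The sketch below is an illustration, not part of the original session. It works in a scratch directory, uses --reflink=auto so that it also runs on file systems without reflink support (where cp falls back to a regular copy), and assumes filefrag from e2fsprogs is installed. On a reflink-enabled XFS, the extents of both files should carry the shared flag.

```shell
# Scratch directory so the example does not touch real data.
work=$(mktemp -d)
cd "$work"

# Small random file; the size is arbitrary.
dd if=/dev/urandom of=test bs=1M count=4 status=none

# auto: clone if the file system supports it, otherwise do a regular copy.
cp --reflink=auto test testfile

# Either way the contents must be identical.
cmp test testfile && echo "contents identical"

# Show the extent maps; a failure (tool missing, unsupported file system) is ignored.
filefrag -v test testfile || true
```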

So this is helpful when making new copies, but what about duplicate data that already exists? For existing data we can use a tool like duperemove.

With duperemove we can do out-of-band deduplication. I’ll make two more normal (non-reflink) copies of our test file:

# cp test test2
# cp test test3
# ls -lsh
total 300M
100M -rw-r--r--. 1 root root 100M Mar  4 18:40 test
100M -rw-r--r--. 1 root root 100M Mar  4 18:47 test2
100M -rw-r--r--. 1 root root 100M Mar  4 18:47 test3

df now shows 340M used, 200M more than before:

/dev/loop0              1014M  340M  675M  34% /mnt

So let’s run duperemove against the directory:

# duperemove -hdr --hashfile=/tmp/test.hash /mnt
Kernel processed data (excludes target files): 400.0M
Comparison of extent info shows a net change in shared extents of: 300.0M
# ls -lsh /mnt
total 300M
100M -rw-r--r--. 1 root root 100M Mar  4 18:40 test
100M -rw-r--r--. 1 root root 100M Mar  4 18:47 test2
100M -rw-r--r--. 1 root root 100M Mar  4 18:47 test3

And here’s our df output:

/dev/loop0              1014M  140M  875M  14% /mnt

We’re back to the 140M we started with, even though every copy is still present.
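
The df numbers in this walkthrough fit a simple accounting: used space is a fixed overhead plus 100M for every distinct copy of the data that is physically stored. The 40M overhead below is inferred from the first df reading (140M used with a single 100M file), not measured separately, so treat it as an assumption:

```shell
overhead=40   # MiB; inferred as 140M used minus the single 100M file
file=100      # MiB; size of the test file

# Before duperemove: test and testfile share one copy, test2 and test3 are full copies.
before=$((overhead + 3 * file))
# After duperemove: all of the files share a single copy.
after=$((overhead + 1 * file))

echo "before dedupe: ${before}M"   # 340M
echo "after dedupe:  ${after}M"    # 140M
```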