Problem with BTRFS Cache Drive

January 18, 2015

FINAL EDIT: Cables replaced, drive functional. No defect with the system.

I have a Samsung SSD formatted with BTRFS that I use as a cache drive. Every so often I get this message, repeated many times:

Jan 17 18:20:51 AmyPond kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO

and the cache becomes read-only, and everything crashes. I can shut down safely (although I sometimes have to umount the drive manually) and restart, and all is well.

But it is standing between me and a wonderfully stress-free server. Looking for help.

I will grab any info anybody needs to help me fix it.

I am running b12 on the system below. Everything else works great, and at the moment I only use the cache drive for appdata for Docker containers, primarily Plex. Plugins on b12: VM manager, SNAP, Cachedirs, NTFS3G support.

EDIT: here is a section of a syslog dump showing what's happening.

Jan 18 01:22:09 AmyPond kernel: BTRFS: error (device sdi1) in btrfs_commit_transaction:1888: errno=-5 IO failure (Error while writing out transaction)
Jan 18 01:22:09 AmyPond kernel: BTRFS info (device sdi1): forced readonly
Jan 18 01:22:09 AmyPond kernel: BTRFS warning (device sdi1): Skipping commit of aborted transaction.
Jan 18 01:22:09 AmyPond kernel: ------------[ cut here ]------------
Jan 18 01:22:09 AmyPond kernel: WARNING: CPU: 3 PID: 3814 at fs/btrfs/super.c:259 __btrfs_abort_transaction+0x4b/0xfb()
Jan 18 01:22:09 AmyPond kernel: BTRFS: Transaction aborted (error -5)
Jan 18 01:22:09 AmyPond kernel: Modules linked in: veth xt_nat kvm_intel kvm vhost_net vhost tun ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_nat_ipv4 nf_nat md_mod iptable_filter ip_tables i2c_i801 igb i2c_algo_bit e1000e ptp pps_core ahci libahci pata_marvell [last unloaded: md_mod]
Jan 18 01:22:09 AmyPond kernel: CPU: 3 PID: 3814 Comm: btrfs-transacti Not tainted 3.17.4-unRAID #1
Jan 18 01:22:09 AmyPond kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z87 Extreme6, BIOS P2.10 07/03/2013
Jan 18 01:22:09 AmyPond kernel: 0000000000000000 ffff8805edb07cc8 ffffffff815e4145 ffff8805edb07d10
Jan 18 01:22:09 AmyPond kernel: ffff8805edb07d00 ffffffff81040d0b ffffffff8129b7de 00000000fffffffb
Jan 18 01:22:09 AmyPond kernel: ffff880609441800 ffff880007c03f00 ffffffff8162aed0 ffff8805edb07d60
Jan 18 01:22:09 AmyPond kernel: Call Trace:
Jan 18 01:22:09 AmyPond kernel: [<ffffffff815e4145>] dump_stack+0x45/0x56
Jan 18 01:22:09 AmyPond kernel: [<ffffffff81040d0b>] warn_slowpath_common+0x75/0x8e
Jan 18 01:22:09 AmyPond kernel: [<ffffffff8129b7de>] ? __btrfs_abort_transaction+0x4b/0xfb
Jan 18 01:22:09 AmyPond kernel: [<ffffffff81040d6b>] warn_slowpath_fmt+0x47/0x49
Jan 18 01:22:09 AmyPond kernel: [<ffffffff8129b7de>] __btrfs_abort_transaction+0x4b/0xfb
Jan 18 01:22:09 AmyPond kernel: [<ffffffff812bf83e>] cleanup_transaction+0x80/0x21d
Jan 18 01:22:09 AmyPond kernel: [<ffffffff8106a2b4>] ? __wake_up_sync+0xd/0xd
Jan 18 01:22:09 AmyPond kernel: [<ffffffff812c074a>] btrfs_commit_transaction+0x857/0x86c
Jan 18 01:22:09 AmyPond kernel: [<ffffffff812bc7c8>] transaction_kthread+0xf7/0x1c8
Jan 18 01:22:09 AmyPond kernel: [<ffffffff812bc6d1>] ? btrfs_cleanup_transaction+0x45b/0x45b
Jan 18 01:22:09 AmyPond kernel: [<ffffffff8105617e>] kthread+0xd6/0xde
Jan 18 01:22:09 AmyPond kernel: [<ffffffff810560a8>] ? kthread_create_on_node+0x168/0x168
Jan 18 01:22:09 AmyPond kernel: [<ffffffff815e9efc>] ret_from_fork+0x7c/0xb0
Jan 18 01:22:09 AmyPond kernel: [<ffffffff810560a8>] ? kthread_create_on_node+0x168/0x168
Jan 18 01:22:09 AmyPond kernel: ---[ end trace 2892253777c4cd8f ]---
Jan 18 01:22:09 AmyPond kernel: BTRFS: error (device sdi1) in cleanup_transaction:1577: errno=-5 IO failure
Jan 18 01:22:09 AmyPond kernel: BTRFS info (device sdi1): delayed_refs has NO entry
Jan 18 01:22:10 AmyPond kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
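
For anyone triaging a similar dump, here is a minimal Python sketch (not part of the original post) that pulls the BTRFS error lines out of a syslog and counts them per device and errno. The regex is an assumption based only on the line format shown above, and `summarize_btrfs_errors` is a made-up helper name. On Linux, errno 5 is EIO, a generic I/O error, which is what `errno=-5` refers to here.

```python
import errno
import re
from collections import Counter

# Assumed line shape, taken from the dump above:
#   BTRFS: error (device sdi1) in btrfs_commit_transaction:1888: errno=-5 IO failure
BTRFS_ERROR = re.compile(
    r"BTRFS: error \(device (?P<dev>\w+)\) in (?P<func>\w+):\d+: errno=(?P<errno>-?\d+)"
)

def summarize_btrfs_errors(lines):
    """Count BTRFS error lines per (device, errno) pair."""
    counts = Counter()
    for line in lines:
        match = BTRFS_ERROR.search(line)
        if match:
            counts[(match.group("dev"), int(match.group("errno")))] += 1
    return counts

# Three lines copied from the syslog excerpt in this post.
sample = [
    "Jan 18 01:22:09 AmyPond kernel: BTRFS: error (device sdi1) in btrfs_commit_transaction:1888: errno=-5 IO failure (Error while writing out transaction)",
    "Jan 18 01:22:09 AmyPond kernel: BTRFS info (device sdi1): forced readonly",
    "Jan 18 01:22:09 AmyPond kernel: BTRFS: error (device sdi1) in cleanup_transaction:1577: errno=-5 IO failure",
]

summary = summarize_btrfs_errors(sample)
for (dev, code), count in summary.items():
    # Map the kernel's negative errno to its symbolic name where known.
    name = errno.errorcode.get(abs(code), "unknown")
    print(f"{dev}: errno={code} ({name}) x{count}")
```

The "info" and "warning" lines are deliberately skipped; only lines the kernel tags as `BTRFS: error` are counted.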

January 18, 2015

If you have the time and space to backup your cache drive files, you could try switching your cache drive over to XFS.

January 18, 2015

Author

If you have the time and space to backup your cache drive files, you could try switching your cache drive over to XFS.

I am backing up everything over the weekend since I expect whatever I have to do to fix it will be destructive to the current setup, and if there's not a solution I will switch to XFS. I was really hoping to move to a pool, so that's why I am hoping to find a solution.

EDIT: Switched to XFS. I think my drive has gone south. Unless someone can tell me that what's going on in that log is software, I'm leaning toward hardware.

January 18, 2015

I/O errors are almost always hardware failures, either the drive dying or a bad cable (power or SATA).
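
One hedged way to tell the two apart is the drive's SMART interface CRC counter (attribute 199, usually reported as `UDMA_CRC_Error_Count` or `CRC_Error_Count`): a raw value that keeps climbing commonly points at the cable or connector rather than the drive itself. Below is a minimal Python sketch that reads that value from the text printed by `smartctl -A`. The helper name `crc_error_count` and the sample output are made up for illustration and are not taken from the poster's drive; the column layout is assumed to be the usual ten-column attribute table.

```python
def crc_error_count(smartctl_output, attribute_id=199):
    """Return the raw value of a SMART attribute from `smartctl -A` text, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0] == str(attribute_id):
            try:
                return int(fields[9])
            except ValueError:
                return None  # raw value was not a plain integer
    return None

# Illustrative sample only; these numbers are invented, not real readings.
sample_output = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       4123
199 CRC_Error_Count         0x003e   099   099   000    Old_age   Always       -       37
"""

print(crc_error_count(sample_output))
```

Comparing the value before and after swapping cables is more informative than a single reading, since the counter does not reset when the cable is replaced.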

January 18, 2015

Author

I/O errors are almost always hardware failures, either the drive dying or a bad cable (power or SATA).

I have re-cabled the drive and am testing now. I am having a hard time believing a 6-month-old SSD is failing. Bad/cheap cables, sure.

January 18, 2015

I/O errors are almost always hardware failures, either the drive dying or a bad cable (power or SATA).

Typically that's the situation, except when immature file systems, such as BTRFS, are involved.

January 20, 2015

I/O errors are almost always hardware failures, either the drive dying or a bad cable (power or SATA).

Typically that's the situation, except when immature file systems, such as BTRFS, are involved.

BTRFS isn't immature. It's been around for quite some time and is the standard file system now for OpenSUSE, which says a lot.

The issues with BTRFS were specific to individual releases during the beta and issues around our implementation within a Docker Loopback image.

The general issues in the open source community with BTRFS primarily surround specific CoW scenarios and rebalancing the larger BTRFS RAID types (5/6).

In our internal testing with BTRFS, we've found that single-device BTRFS (for array devices) is fine. It's not causing issues, and it adds copy-on-write, snapshots, and checksum support for the data stored inside. We also found that BTRFS RAID 1 groups (as in our cache pool) work fine. I've performed scrub and rebalancing operations on the cache pool while playing Titanfall in a Win 8.1 VM with a GPU passed through, running off the same storage, all at the same time... No issues.

If folks are having problems with BTRFS in unRAID, we need to know that, but here's a prime example of where the issue wasn't a BTRFS problem, but a cabling issue. Let's not jump all over BTRFS as the root cause of issues when we don't have enough info to diagnose just yet. Switching filesystems isn't something to suggest lightly, as it can take a massive amount of time and gain you nothing (as it would have here).
