7000+ BTRFS Errors



Unbeknownst to me, over the course of two hours, my server experienced over 7000 btrfs errors. I frankly have very little idea of what happened. I tried to stop the array as soon as I realized what was happening. After a minute, Unraid threw the "Retry unmounting user share(s)" message. I then booted into safe mode. I don't know where to go from here. Please help me out; I would greatly appreciate it. My very, very long syslog is attached.

errors.png

syslog-20210404-200343.txt


Problem was caused by one of the cache devices dropping offline:

 

Apr  4 18:50:50 M1171-NAS kernel: ata1: softreset failed (1st FIS failed)
Apr  4 18:50:50 M1171-NAS kernel: ata1: limiting SATA link speed to 3.0 Gbps
Apr  4 18:50:50 M1171-NAS kernel: ata1: hard resetting link
Apr  4 18:50:55 M1171-NAS kernel: ata1: softreset failed (1st FIS failed)
Apr  4 18:50:55 M1171-NAS kernel: ata1: reset failed, giving up
Apr  4 18:50:55 M1171-NAS kernel: ata1.00: disabled

 

There's a known issue with the onboard SATA controller in some Ryzen boards, look for a BIOS update, or use an add-on controller.
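For anyone wanting to check their own syslog for the same symptom, here is a minimal Python sketch that looks for the two kernel messages quoted above ("reset failed, giving up" and "disabled"). The function name and the regular expression are my own illustration, not part of Unraid or any btrfs tool, and other kernels may word these messages differently.

```python
import re

# Assumption: the kernel reports a dropped SATA link with the same wording
# as in the log excerpt above ("ataN: reset failed, giving up" / "ataN.MM: disabled").
ATA_FAILURE = re.compile(
    r"kernel: (ata\d+(?:\.\d+)?): (disabled|reset failed, giving up)"
)


def find_dropped_ata_devices(lines):
    """Return (port, message) pairs for ATA links the kernel gave up on."""
    hits = []
    for line in lines:
        match = ATA_FAILURE.search(line)
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits


# Sample lines copied from the syslog excerpt in this thread.
sample = [
    "Apr  4 18:50:55 M1171-NAS kernel: ata1: softreset failed (1st FIS failed)",
    "Apr  4 18:50:55 M1171-NAS kernel: ata1: reset failed, giving up",
    "Apr  4 18:50:55 M1171-NAS kernel: ata1.00: disabled",
]

for port, message in find_dropped_ata_devices(sample):
    print(port, "->", message)
```

To run it against a real file, pass an open file object instead of `sample`, since the function only needs an iterable of lines.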

Problem was caused by one of the cache devices dropping offline:
 
Apr  4 18:50:50 M1171-NAS kernel: ata1: softreset failed (1st FIS failed)
Apr  4 18:50:50 M1171-NAS kernel: ata1: limiting SATA link speed to 3.0 Gbps
Apr  4 18:50:50 M1171-NAS kernel: ata1: hard resetting link
Apr  4 18:50:55 M1171-NAS kernel: ata1: softreset failed (1st FIS failed)
Apr  4 18:50:55 M1171-NAS kernel: ata1: reset failed, giving up
Apr  4 18:50:55 M1171-NAS kernel: ata1.00: disabled

 
There's a known issue with the onboard SATA controller in some Ryzen boards, look for a BIOS update, or use an add-on controller.


Ah hah, this is a relatively new board. My old board's BIOS was up to date; this one may not be. Thank you very much. I'll look for an update.
10 hours ago, JorgeB said:

Problem was caused by one of the cache devices dropping offline:

 


Apr  4 18:50:50 M1171-NAS kernel: ata1: softreset failed (1st FIS failed)
Apr  4 18:50:50 M1171-NAS kernel: ata1: limiting SATA link speed to 3.0 Gbps
Apr  4 18:50:50 M1171-NAS kernel: ata1: hard resetting link
Apr  4 18:50:55 M1171-NAS kernel: ata1: softreset failed (1st FIS failed)
Apr  4 18:50:55 M1171-NAS kernel: ata1: reset failed, giving up
Apr  4 18:50:55 M1171-NAS kernel: ata1.00: disabled

 

There's a known issue with the onboard SATA controller in some Ryzen boards, look for a BIOS update, or use an add-on controller.

After updating the BIOS and rebooting into Unraid, I checked the syslog and was met with additional errors. I've stopped the array again. Do you know what my NVMe is doing?

More in diagnostics

Quote

 

Apr  5 16:17:38 Jared-NAS emhttpd: shcmd (47): mount -t btrfs -o noatime,space_cache=v2,discard=async -U f3f35778-a797-4761-b345-d45e72821985 /mnt/cache
Apr  5 16:17:38 Jared-NAS kernel: BTRFS info (device nvme0n1p1): turning on async discard
Apr  5 16:17:38 Jared-NAS kernel: BTRFS info (device nvme0n1p1): using free space tree
Apr  5 16:17:38 Jared-NAS kernel: BTRFS info (device nvme0n1p1): has skinny extents

Apr  5 16:17:38 M1171-NAS kernel: BTRFS error (device nvme0n1p1): parent transid verify failed on 7546413809664 wanted 4193807 found 4193798
Apr  5 16:17:38 M1171-NAS kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 7546413809664 (dev /dev/sdb1 sector 136423488)
Apr  5 16:17:38 M1171-NAS kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 7546413813760 (dev /dev/sdb1 sector 136423496)
Apr  5 16:17:38 M1171-NAS kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 7546413817856 (dev /dev/sdb1 sector 136423504)
Apr  5 16:17:38 M1171-NAS kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 7546413821952 (dev /dev/sdb1 sector 136423512)
Apr  5 16:17:38 M1171-NAS kernel: BTRFS error (device nvme0n1p1): parent transid verify failed on 7546413826048 wanted 4193807 found 4193798
Apr  5 16:17:38 M1171-NAS kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 7546413826048 (dev /dev/sdb1 sector 136423520)
Apr  5 16:17:38 M1171-NAS kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 7546413830144 (dev /dev/sdb1 sector 136423528)
Apr  5 16:17:38 M1171-NAS kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 7546413834240 (dev /dev/sdb1 sector 136423536)
Apr  5 16:17:38 M1171-NAS kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 7546413838336 (dev /dev/sdb1 sector 136423544)
Apr  5 16:17:38 M1171-NAS kernel: BTRFS error (device nvme0n1p1): parent transid verify failed on 7546413858816 wanted 4193807 found 4193798
Apr  5 16:17:38 M1171-NAS kernel: BTRFS info (device nvme0n1p1): read error corrected: ino 0 off 7546413858816 (dev /dev/sdb1 sector 136423584)
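A note on reading these lines: as far as I understand it, "wanted" is the generation (transid) the parent block expects and "found" is the generation actually stored in the copy that was read, so a lower "found" value suggests that copy missed some writes. The Python sketch below pulls both numbers out of each line and reports the gap. The function name and regex are only illustrative and assume the message format shown above.

```python
import re

# Assumption: the kernel message format matches the excerpt in this post.
TRANSID = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def transid_gaps(lines):
    """Return (logical_address, wanted - found) for each transid failure."""
    gaps = []
    for line in lines:
        match = TRANSID.search(line)
        if match:
            logical, wanted, found = (int(value) for value in match.groups())
            gaps.append((logical, wanted - found))
    return gaps


# Sample line copied from the log above.
sample = [
    "Apr  5 16:17:38 M1171-NAS kernel: BTRFS error (device nvme0n1p1): "
    "parent transid verify failed on 7546413809664 wanted 4193807 found 4193798",
]

for logical, gap in transid_gaps(sample):
    print(f"block {logical}: stale by {gap} generations")
```

In the lines quoted here every failure has the same wanted/found pair, which fits one device having fallen behind while it was offline.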

 

 

M1171-nas-diagnostics-20210405-1626.zip

15 minutes ago, mason1171 said:

After scrubbing the cache, it looks like all errors were corrected

Yep.

 

The array should be fine: at the last mount all disks were clean. Btrfs shows any accumulated errors at mount time (unless you clear them), e.g., this was from the pool:

 

Apr  5 16:17:38 M1171-NAS kernel: BTRFS info (device nvme0n1p1): bdev /dev/sdb1 errs: wr 283405, rd 112295, flush 0, corrupt 0, gen 0
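If you want to track those counters over time, a small Python sketch like the one below can split the mount-time line into named fields. The function name and regex are hypothetical and assume the exact "bdev ... errs:" format shown above. The counters are cumulative, so they stay non-zero after a successful scrub until they are reset (I believe `btrfs dev stats -z` on the mount point does that, but check the man page for your version).

```python
import re

# Assumption: the mount-time summary line uses the format quoted above.
BDEV_ERRS = re.compile(
    r"bdev (\S+) errs: wr (\d+), rd (\d+), flush (\d+), corrupt (\d+), gen (\d+)"
)
FIELDS = ("wr", "rd", "flush", "corrupt", "gen")


def parse_bdev_errs(line):
    """Return (device, {counter: value}) or None if the line does not match."""
    match = BDEV_ERRS.search(line)
    if match is None:
        return None
    device = match.group(1)
    counters = dict(zip(FIELDS, (int(v) for v in match.groups()[1:])))
    return device, counters


# Sample line copied from the log above.
sample = (
    "Apr  5 16:17:38 M1171-NAS kernel: BTRFS info (device nvme0n1p1): "
    "bdev /dev/sdb1 errs: wr 283405, rd 112295, flush 0, corrupt 0, gen 0"
)

device, counters = parse_bdev_errs(sample)
print(device, counters)
```

Here the write and read counters are the ones that grew while the device was offline, while corrupt and gen stayed at zero.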

 

