• [6.10rc4] BTRFS Cache Errors


    nickp85
    • Closed

    I updated to v6.10rc4 a couple of days ago and am noticing BTRFS errors in my cache log. I thought perhaps my cache was going bad, so I rebuilt it the standard way: used mover to get everything off, reformatted, and used mover to put everything back.

     

    I was not seeing read/corrupt errors last night right after doing this, but I am again today. The cache seems to be operating fine, as Docker and VMs are working, but I am concerned about what I see in the logs.

     

    The 2 NVMe drives I'm using are Samsung 960 Pro 512GB. They are a little over 4 years old but have only 152TBW written; they are warrantied for 5yr/400TBW.

     

    Not sure if this is an Unraid issue since it just started after rc4 and rebuilding the cache did not clear the errors. Attaching new diagnostics.

    nicknas2-diagnostics-20220327-1440.zip





    This is likely a general support issue. Btrfs is detecting data corruption on both devices:

     

    Mar 25 23:20:21 nicknas2 kernel: BTRFS info (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 1872, gen 0
    Mar 25 23:20:21 nicknas2 kernel: BTRFS info (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 1594, gen 0
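For anyone reading these lines for the first time: each one is the per-device error summary btrfs prints for a pool member, with counters for write errors (`wr`), read errors (`rd`), flush errors (`flush`), checksum/corruption errors (`corrupt`) and generation errors (`gen`). Below is a minimal Python sketch that pulls those counters out of syslog lines like the two above. The regular expression and the function names are illustrative assumptions based only on the format shown here, not an official parser.

```python
import re

# Assumed line format, taken from the two syslog lines quoted in this thread.
ERRS_RE = re.compile(
    r"BTRFS info \(device (?P<fs_dev>[^)]+)\): bdev (?P<bdev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

COUNTERS = ("wr", "rd", "flush", "corrupt", "gen")


def parse_btrfs_errs(line):
    """Return (bdev, {counter: int}) for a matching line, else None."""
    match = ERRS_RE.search(line)
    if match is None:
        return None
    return match.group("bdev"), {name: int(match.group(name)) for name in COUNTERS}


def summarize(lines):
    """Map each block device to its most recently logged counters."""
    summary = {}
    for line in lines:
        parsed = parse_btrfs_errs(line)
        if parsed is not None:
            bdev, counters = parsed
            summary[bdev] = counters
    return summary


SAMPLE = [
    "Mar 25 23:20:21 nicknas2 kernel: BTRFS info (device nvme0n1p1): "
    "bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 1872, gen 0",
    "Mar 25 23:20:21 nicknas2 kernel: BTRFS info (device nvme0n1p1): "
    "bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 1594, gen 0",
]

for bdev, counters in summarize(SAMPLE).items():
    nonzero = {name: value for name, value in counters.items() if value}
    print(bdev, nonzero or "no errors")
```

Note that both lines name `nvme0n1p1` in the `(device ...)` part, while the `bdev` field names the individual pool member, which is why the sketch keys on `bdev`.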

     

    Start by running memtest.

    On 3/28/2022 at 4:44 AM, JorgeB said:

    Start by running memtest.

    Going on 22 hours, 14 passes done and zero errors. This machine is definitely stable.

     

    The corruption errors show up for both NVMe drives when I use the terminal command to print them out by device, so I doubt both drives are bad.

     

    Something fishy is going on. I did not have this issue until either rc3 or rc4. I would not have noticed if I hadn't randomly looked at the disk log while I was in the console one day.

    Edited by nickp85

    One thing to note is that the memtest86 that comes with UNRAID is an old version*. If you want the most recent, you have to download it from https://www.memtest86.com/ and create your own boot flash. Note that this version restriction is imposed by memtest86.

     

    * At least this was the case the last time I used the tool, YMMV

    Edited by jbartlett

    I am booting with UEFI, so I booted off a separate stick that has the latest bootable version of memtest86 on it.

    Edited by nickp85

    The NVMe devices are unlikely to be the problem. If no RAM errors have been found so far, run a scrub to see whether all errors are correctable. If they are, see here to reset the filesystem stats and to learn how to monitor the pool for further errors. If they aren't all correctable, back up and reformat the pool, then monitor it as well.
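The scrub, reset and monitor steps above can be sketched in Python roughly as follows. This is a hedged outline, not the procedure from the linked post: the pool path `/mnt/cache` and all function names are assumptions, while `btrfs scrub start -B`, `btrfs device stats` and `btrfs device stats -z` are the standard btrfs-progs commands for a foreground scrub, printing the counters, and resetting them. Only the parsing helpers run without a real pool.

```python
import subprocess

POOL = "/mnt/cache"  # assumption: adjust to wherever the pool is mounted


def parse_device_stats(text):
    """Parse `btrfs device stats` output into {device: {counter: value}}.

    Expected line shape: "[/dev/nvme0n1p1].corruption_errs  0"
    """
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("["):
            continue
        key, value = line.rsplit(None, 1)
        device, counter = key[1:].split("].", 1)
        stats.setdefault(device, {})[counter] = int(value)
    return stats


def devices_with_errors(stats):
    """Keep only the devices that have at least one non-zero counter."""
    return {
        device: {name: value for name, value in counters.items() if value}
        for device, counters in stats.items()
        if any(counters.values())
    }


def run(cmd):
    # The return code is not treated as fatal here, since scrub may exit
    # non-zero when it finds errors; the caller inspects the result instead.
    return subprocess.run(cmd, capture_output=True, text=True)


def scrub_and_report(pool=POOL):
    """Run a foreground scrub, then return the devices showing errors."""
    scrub = run(["btrfs", "scrub", "start", "-B", pool])
    print(scrub.stdout or scrub.stderr)
    stats = run(["btrfs", "device", "stats", pool])
    return devices_with_errors(parse_device_stats(stats.stdout))


def reset_stats(pool=POOL):
    """Zero the counters so that any new errors stand out later."""
    return run(["btrfs", "device", "stats", "-z", pool])
```

A possible routine: call `scrub_and_report()` once, read the scrub summary to check that every error was corrected, call `reset_stats()`, and then re-run the stats check on a schedule. Any counter that becomes non-zero again points to new errors since the reset.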


    Since you started a new thread in the general support forum, and that's likely the best place for this, let's continue there. We can reopen this in the future if needed.

     

     



