• 6.11.3 -> 6.11.4 multiple crashes - see log and diags


    bigbangus
    • Solved

    Upgraded to 6.11.4 from 6.11.3.

     

    Two crashes were observed soon after:

     

    1) Right after upgrading, I tried to launch my WIN10VM while the dockers were loading. The system froze and my win10vm was unresponsive at login screen. All my dockers were down and my unraid GUI was unreachable.

    Nov 19 17:01:17 unraid kernel: BUG: Bad page state in process qemu-system-x86  pfn:7559e9
    Nov 19 17:01:17 unraid kernel: page:00000000c01ab2ff refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7559e9
    Nov 19 17:01:17 unraid kernel: flags: 0x2ffff0000000008(dirty|node=0|zone=2|lastcpupid=0xffff)
    Nov 19 17:01:17 unraid kernel: raw: 02ffff0000000008 ffffea001d567a48 ffffea001d567a48 0000000000000000
    Nov 19 17:01:17 unraid kernel: raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
    Nov 19 17:01:17 unraid kernel: page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag(s) set

     

    2) During morning CA backup while my Win10VM was running. Same symptoms as before.

    Nov 20 04:09:55 unraid kernel: BUG: kernel NULL pointer dereference, address: 0000000000000088
    Nov 20 04:09:55 unraid kernel: #PF: supervisor read access in kernel mode
    Nov 20 04:09:55 unraid kernel: #PF: error_code(0x0000) - not-present page
    Nov 20 04:09:55 unraid kernel: PGD 0 P4D 0 
    Nov 20 04:09:55 unraid kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
    Nov 20 04:09:55 unraid kernel: CPU: 5 PID: 263 Comm: kswapd0 Tainted: P    B      O      5.19.17-Unraid #2
    Nov 20 04:09:55 unraid kernel: Hardware name: To Be Filled By O.E.M. B550M Pro4/B550M Pro4, BIOS P2.30 02/24/2022
    Nov 20 04:09:55 unraid kernel: RIP: 0010:mem_cgroup_lruvec+0x35/0x4c

     

    The common thread seems to be that it crashed while the dockers were under load: once while they were loading at startup and once during the CA backup. Both times the VM was on.

     

    pfsense-unraid2022-11-19.log pfsense-unraid2022-11-20.log unraidnas-diagnostics-20221120-0657.zip





    OK, now it's also crashing on 6.11.3 when I launch my VM. Now I am a bit confused, especially since I'd been on 6.11.3 for over a week with my Win10 VM running with no issues.

     

    See attached for crash log and diags.

     

    Nov 20 07:35:24 unraid kernel: BUG: Bad page state in process uwsgi  pfn:4b47cc
    Nov 20 07:35:24 unraid kernel: page:000000000856407b refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x4b47cc
    Nov 20 07:35:24 unraid kernel: flags: 0x2ffff0000000008(dirty|node=0|zone=2|lastcpupid=0xffff)
    Nov 20 07:35:24 unraid kernel: raw: 02ffff0000000008 dead000000000100 dead000000000122 0000000000000000
    Nov 20 07:35:24 unraid kernel: raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
    Nov 20 07:35:24 unraid kernel: page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag(s) set
    Nov 20 07:35:24 unraid kernel: Modules linked in: af_packet nvidia_uvm(PO) xt_nat veth xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs dm_crypt dm_mod dax md_mod nct6775 nct6775_core hwmon_vid efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding tls wmi_bmof nvidia_drm(PO) nvidia_modeset(PO) edac_mce_amd edac_core nvidia(PO) kvm_amd drm_kms_helper kvm drm backlight crct10dif_pclmul i2c_piix4 crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd rapl r8169 nvme i2c_core k10temp ccp syscopyarea ahci sysfillrect nvme_core realtek joydev sysimgblt libahci fb_sys_fops wmi tpm_crb tpm_tis tpm_tis_core tpm acpi_cpufreq button unix
    Nov 20 07:35:24 unraid kernel: CPU: 1 PID: 25802 Comm: uwsgi Tainted: P    B      O      5.19.17-Unraid #2
    Nov 20 07:35:24 unraid kernel: Hardware name: To Be Filled By O.E.M. B550M Pro4/B550M Pro4, BIOS P2.30 02/24/2022

     

    pfsense-unraid2022-11-20_6.11.3.log unraidnas-diagnostics-20221120-0833.zip

    Link to comment

    I am no expert, and will leave it to others to look at your diags and comment on anything they find there.

     

    I know this has nothing to do with VMs, but your description of the crash behaviour sounds like what several users have been reporting since upgrading to 6.11.x. They have narrowed it down to a conflict between the libtorrent libraries and the latest OS kernel, and this has helped me. If you are running a torrent docker, this may apply to you.

     

    Have a look here, and especially to binhex's comments on page 6: 

    https://forums.unraid.net/bug-reports/stable-releases/crashes-since-updating-to-v611x-r2153/?page=6&tab=comments

     

    When I turn off binhex qbittorrentvpn docker, the crashes stop.  I have downgraded that docker and am watching to see if it makes a difference.

     

    III_D

    unraid 6.11.3

    Repository: binhex/arch-qbittorrentvpn:4.3.9-2-01

     

    Link to comment

    I have the same issue, or at least seems the same as I haven't had a chance to dig into it. Mine didn't even get to the login screen though. I boot my VM from an NVMe drive that is passed through. I recently added some disks and have been frustrated that doing so screwed up the boot order and that the BIOS doesn't keep the changes I make. I've just been dealing with it for now, but I updated and saw the new boot order options, so I put the NVMe drive as 1 and left the others blank. When I try to start the VM, after a few minutes the entire server locks up: the GUI and SSH are unresponsive and the only thing to do is a hard shutdown. Tried it again, same thing. I haven't tried putting the boot order back yet because I am waiting for the parity check to finish.

    Edited by bobbintb
    Link to comment
    26 minutes ago, bobbintb said:

    I have the same issue, or at least seems the same as I haven't had a chance to dig into it. 

     

    Yeah, definitely post your diags and logs in a dedicated thread if possible so someone can look into it. Otherwise it's not possible to know if the issues are related.

    Link to comment

    Had another crash with the VM off, then decided to leave off compreface-gpu and double-take dockers knowing these guys are heavy lifters (really no evidence says to turn them off, just a hunch).

     

    Then left the house, came home to my zigbee2mqtt docker dead. The log said something about write-only only. Looking at the pfsense remote syslog (see attached):

     

    Nov 20 12:30:34 unraid kernel: BTRFS critical (device dm-3): corrupt leaf: root=7 block=3573797388288 slot=263, unaligned key offset for csum item, have 3069387368448 should be aligned to 4096
    Nov 20 12:30:34 unraid kernel: BTRFS info (device dm-3): leaf 3573797388288 gen 2132420 total ptrs 329 free space 4214 owner 7
    Nov 20 12:30:34 unraid kernel: 	item 0 key (18446744073709551606 128 3069384982528) itemoff 16271 itemsize 12
    Nov 20 12:30:34 unraid kernel: 	item 1 key (18446744073709551606 128 3069384994816) itemoff 16267 itemsize 4
    Nov 20 12:30:34 unraid kernel: 	item 2 key (18446744073709551606 128 3069384998912) itemoff 16259 itemsize 8
    Nov 20 12:30:34 unraid kernel: 	item 3 key (18446744073709551606 128 3069385007104) itemoff 16243 itemsize 16

     

    Then:

    Nov 20 12:30:34 unraid kernel: BTRFS error (device dm-3): block=3573797388288 write time tree block corruption detected
    Nov 20 12:30:34 unraid kernel: BTRFS: error (device dm-3) in btrfs_commit_transaction:2418: errno=-5 IO failure (Error while writing out transaction)
    Nov 20 12:30:34 unraid kernel: BTRFS info (device dm-3: state E): forced readonly
    Nov 20 12:30:34 unraid kernel: BTRFS warning (device dm-3: state E): Skipping commit of aborted transaction.
    Nov 20 12:30:34 unraid kernel: BTRFS: error (device dm-3: state EA) in cleanup_transaction:1982: errno=-5 IO failure
    Nov 20 12:30:34 unraid kernel: I/O error, dev loop2, sector 1953024 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0

     

    So I noticed 6.11.5 was out and figured, what the heck, why not upgrade the OS to that. Also, in pure desperation, I removed the following append from my flash:

    nvme_core.default_ps_max_latency_us=0 pcie_aspm=off

    this was suggested in the log from a previous crash this year.

     

    Do I just have a shitty nvme here?

    pfsense-unraid2022-11-20_morecrash.log

    Link to comment

    You are having multiple apps crashing and btrfs is detecting data corruption. There's also this:

     

    BTRFS error (device dm-3): block=3573797388288 write time tree block corruption detected

     

    Which usually means bad RAM, or other kernel memory corruption, start by running memtest.

     

     

     

     

    Link to comment
    3 hours ago, JorgeB said:

    Which usually means bad RAM, or other kernel memory corruption, start by running memtest.

     

    After several more crashes yesterday that I didn't report here, I pulled half of my RAM out, and the server has now been running without problems for 12 hours. Thank you!

     

    Do you think the bad RAM could be responsible for the previous NVMe / USB device crashes I've had this year?

     

    Edited by bigbangus
    Link to comment

    Server still going strong, but I noticed there was an error this morning that I had overlooked. @JorgeB Is there something I need to dig into?

     

    Nov 21 04:17:20 UnraidNAS CA Backup/Restore: Verifying backup
    Nov 21 04:17:20 UnraidNAS CA Backup/Restore: Using command: cd '/mnt/user/appdata/' && /usr/bin/tar --diff -C '/mnt/user/appdata/' -af '/mnt/user/backups/appdata/[email protected]/CA_backup.tar.gz' > /var/lib/docker/unraid/ca.backup2.datastore/appdata_backup.log & echo $! > /tmp/ca.backup2/tempFiles/verifyInProgress
    Nov 21 04:18:00 UnraidNAS kernel: BTRFS warning (device dm-3): csum failed root 5 ino 17390531 off 327680 csum 0x32c11707 expected csum 0x43348dfc mirror 1
    Nov 21 04:18:00 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0

     

    Should I do a BTRFS scrub or something?
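For anyone reading along: the `errs:` line in the log above carries the per-device btrfs error counters (write, read, flush, corruption, generation). Below is a minimal Python sketch for pulling those numbers out of a syslog line so they can be compared over time. The function name `parse_errs` is made up for illustration and the pattern is only checked against the line format shown in this thread. As far as I know these counters persist until they are reset, so a rising `corrupt` value matters more than the absolute number.

```python
import re

# Pattern for the per-device counter line as it appears in this thread's syslog.
# Assumption: other kernel versions may format this line differently.
ERRS_RE = re.compile(
    r"BTRFS error \(device (?P<fs>[^)]+)\): bdev (?P<bdev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_errs(line):
    """Return the counters from one 'errs:' line as a dict, or None if absent."""
    m = ERRS_RE.search(line)
    if m is None:
        return None
    counters = {k: int(m.group(k)) for k in ("wr", "rd", "flush", "corrupt", "gen")}
    counters["bdev"] = m.group("bdev")
    return counters


# Line copied from the log above.
line = (
    "Nov 21 04:18:00 UnraidNAS kernel: BTRFS error (device dm-3): "
    "bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0"
)
print(parse_errs(line))
```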

    unraidnas-diagnostics-20221121-1127.zip

    Edited by bigbangus
    Link to comment
    2 minutes ago, bigbangus said:

    Should I do a BTRFS scrub

    Yes, and if it finds corruption look at the syslog for a list of affected file(s) and delete/restore from backups.

    Link to comment

    OK, scrub in progress, thanks. I did an inode find and deleted the affected file beforehand.

     

    root@UnraidNAS:~# find /mnt/cache -inum 17390531
    /mnt/cache/appdata/binhex-radarr/logs/radarr.39.txt
    root@UnraidNAS:~# cd /mnt/cache/appdata/binhex-radarr/logs/
    root@UnraidNAS:/mnt/cache/appdata/binhex-radarr/logs# rm radarr.39.txt 
    root@UnraidNAS:/mnt/cache/appdata/binhex-radarr/logs# 
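The same lookup can be scripted. This is a rough Python sketch, not anything Unraid ships: it pulls the `ino` value out of a `csum failed` warning and walks a directory tree looking for that inode number, much like `find -inum`. The helper names `inode_from_warning` and `find_by_inode` are hypothetical. One caveat: btrfs inode numbers are, to my knowledge, only unique within a subvolume, so on a pool with several subvolumes a match should be double-checked against the `root` number in the warning.

```python
import os
import re

# Matches the inode number in a btrfs "csum failed" warning (format as seen in this thread).
INO_RE = re.compile(r"csum failed root \d+ ino (\d+)")


def inode_from_warning(line):
    """Extract the inode number from a 'csum failed' line, or None if not found."""
    m = INO_RE.search(line)
    return int(m.group(1)) if m else None


def find_by_inode(top, ino):
    """Walk 'top' and return every path whose inode number equals 'ino'."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so symlinks are checked themselves, not their targets
                if os.lstat(path).st_ino == ino:
                    matches.append(path)
            except OSError:
                continue  # entry vanished or is unreadable; skip it
    return matches


# Warning line copied from the earlier post in this thread.
WARNING = (
    "Nov 21 04:18:00 UnraidNAS kernel: BTRFS warning (device dm-3): csum failed "
    "root 5 ino 17390531 off 327680 csum 0x32c11707 expected csum 0x43348dfc mirror 1"
)
print(inode_from_warning(WARNING))
# On the server this would then be: find_by_inode("/mnt/cache", inode_from_warning(WARNING))
```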

     

    Link to comment

    Not sure what to make of this. The scrub report shows 0 corrected / 0 uncorrectable / 0 unverified, but the syslog is littered with warnings and errors.

     

    SCRUB report:

    UUID:             8fc31d8d-1584-4ae3-a473-1894b03996f3
    Scrub started:    Mon Nov 21 11:38:48 2022
    Status:           finished
    Duration:         0:11:14
    Total to scrub:   492.07GiB
    Rate:             747.58MiB/s
    Error summary:    read=10 csum=6
      Corrected:      0
      Uncorrectable:  0
      Unverified:     0

     

    Nov 21 11:41:20 UnraidNAS kernel: BTRFS warning (device dm-3): checksum error at logical 3079392104448 on dev /dev/mapper/nvme0n1p1, physical 67479179264, root 5, inode 31329563, offset 8192, length 4096, links 1 (path: appdata/binhex-plex/Plex Media Server/Cache/PhotoTranscoder/97/976adb790e407d0bd4ba7d0dd9809d2dc58d3244.jpg)
    Nov 21 11:41:20 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
    Nov 21 11:41:33 UnraidNAS autofan: Highest disk temp is 38C, adjusting fan speed from: 190 (74% @ 1115rpm) to: 210 (82% @ 1113rpm)
    Nov 21 11:42:13 UnraidNAS autofan: Highest disk temp is 37C, adjusting fan speed from: 210 (82% @ 1118rpm) to: 190 (74% @ 1034rpm)
    Nov 21 11:43:18 UnraidNAS autofan: Highest disk temp is 38C, adjusting fan speed from: 190 (74% @ 1028rpm) to: 210 (82% @ 1114rpm)
    Nov 21 11:44:09 UnraidNAS autofan: Highest disk temp is 38C, adjusting fan speed from: 190 (74% @ 2014rpm) to: 210 (82% @ 1965rpm)
    Nov 21 11:45:35 UnraidNAS kernel: BTRFS warning (device dm-3): checksum error at logical 3256795521024 on dev /dev/mapper/nvme0n1p1, physical 253472530432, root 5, inode 31329650, offset 499712, length 4096, links 1 (path: appdata/binhex-plex/Plex Media Server/Cache/PhotoTranscoder/e4/e42eaed652dadd123dbe4809b217b70a37fa677d.jpg)
    Nov 21 11:45:35 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
    Nov 21 11:45:54 UnraidNAS kernel: BTRFS warning (device dm-3): checksum error at logical 3280635400192 on dev /dev/mapper/nvme0n1p1, physical 277312409600, root 5, inode 11017814, offset 253952, length 4096, links 1 (path: appdata/binhex-plex/Plex Media Server/Metadata/TV Shows/7/48e11ccd198ef58ee9cd436a4c8e896a5032f03.bundle/Contents/_combined/posters/tv.plex.agents.series_f85b35997126ad291e407c43271f62cee4ffcef1)
    Nov 21 11:45:54 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
    Nov 21 11:46:39 UnraidNAS autofan: Highest disk temp is 37C, adjusting fan speed from: 210 (82% @ 1104rpm) to: 190 (74% @ 1116rpm)
    Nov 21 11:47:35 UnraidNAS kernel: BTRFS warning (device dm-3): checksum error at logical 3360888418304 on dev /dev/mapper/nvme0n1p1, physical 365081620480, root 5, inode 31329467, offset 827392, length 4096, links 1 (path: appdata/binhex-plex/Plex Media Server/Cache/PhotoTranscoder/36/3675b4617d307130c996a8c288e4ab33ad400e72.jpg)
    Nov 21 11:47:35 UnraidNAS kernel: BTRFS warning (device dm-3): checksum error at logical 3360887865344 on dev /dev/mapper/nvme0n1p1, physical 365081067520, root 5, inode 31329467, offset 274432, length 4096, links 1 (path: appdata/binhex-plex/Plex Media Server/Cache/PhotoTranscoder/36/3675b4617d307130c996a8c288e4ab33ad400e72.jpg)
    Nov 21 11:47:35 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
    Nov 21 11:47:35 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
    Nov 21 11:47:44 UnraidNAS autofan: Highest disk temp is 38C, adjusting fan speed from: 190 (74% @ 1116rpm) to: 210 (82% @ 1112rpm)
    Nov 21 11:48:45 UnraidNAS kernel: BTRFS warning (device dm-3): checksum error at logical 3406798528512 on dev /dev/mapper/nvme0n1p1, physical 422802890752, root 5, inode 31329505, offset 3088384, length 4096, links 1 (path: appdata/binhex-plex/Plex Media Server/Cache/PhotoTranscoder/bb/bbc8674fca63d0f667f64db95b53fd4f84bb4b68.jpg)
    Nov 21 11:48:45 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 8, gen 0
    Nov 21 11:48:56 UnraidNAS kernel: nvme0n1: I/O Cmd(0x2) @ LBA 861063264, 1024 blocks, I/O Error (sct 0x2 / sc 0x81) MORE DNR 
    Nov 21 11:48:56 UnraidNAS kernel: critical medium error, dev nvme0n1, sector 861063264 op 0x0:(READ) flags 0x4000 phys_seg 94 prio class 0
    Nov 21 11:48:56 UnraidNAS kernel: nvme0n1: I/O Cmd(0x2) @ LBA 861069408, 1024 blocks, I/O Error (sct 0x2 / sc 0x81) MORE DNR 
    Nov 21 11:48:56 UnraidNAS kernel: critical medium error, dev nvme0n1, sector 861069408 op 0x0:(READ) flags 0x4000 phys_seg 20 prio class 0
    Nov 21 11:48:56 UnraidNAS kernel: nvme0n1: I/O Cmd(0x2) @ LBA 861133560, 768 blocks, I/O Error (sct 0x2 / sc 0x81) MORE DNR 
    Nov 21 11:48:56 UnraidNAS kernel: critical medium error, dev nvme0n1, sector 861133560 op 0x0:(READ) flags 0x4000 phys_seg 96 prio class 0
    Nov 21 11:48:56 UnraidNAS kernel: nvme0n1: I/O Cmd(0x2) @ LBA 861063336, 8 blocks, I/O Error (sct 0x2 / sc 0x81) MORE DNR 
    Nov 21 11:48:56 UnraidNAS kernel: critical medium error, dev nvme0n1, sector 861063336 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS warning (device dm-3): i/o error at logical 3418399789056 on dev /dev/mapper/nvme0n1p1, physical 440846602240, root 5, inode 1304841, offset 31049420800, length 4096, links 1 (path: domains/Ubuntu/vdisk1.img)
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 1, flush 0, corrupt 8, gen 0
    Nov 21 11:48:56 UnraidNAS kernel: nvme0n1: I/O Cmd(0x2) @ LBA 861069512, 8 blocks, I/O Error (sct 0x2 / sc 0x81) MORE DNR 
    Nov 21 11:48:56 UnraidNAS kernel: critical medium error, dev nvme0n1, sector 861069512 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS warning (device dm-3): i/o error at logical 3418402951168 on dev /dev/mapper/nvme0n1p1, physical 440849764352, root 5, inode 1304841, offset 31052582912, length 4096, links 1 (path: domains/Ubuntu/vdisk1.img)
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 2, flush 0, corrupt 8, gen 0
    Nov 21 11:48:56 UnraidNAS kernel: nvme0n1: I/O Cmd(0x2) @ LBA 861063368, 8 blocks, I/O Error (sct 0x2 / sc 0x81) MORE DNR 
    Nov 21 11:48:56 UnraidNAS kernel: critical medium error, dev nvme0n1, sector 861063368 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS warning (device dm-3): i/o error at logical 3418399805440 on dev /dev/mapper/nvme0n1p1, physical 440846618624, root 5, inode 1304841, offset 31049437184, length 4096, links 1 (path: domains/Ubuntu/vdisk1.img)
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 3, flush 0, corrupt 8, gen 0
    Nov 21 11:48:56 UnraidNAS kernel: nvme0n1: I/O Cmd(0x2) @ LBA 861069520, 8 blocks, I/O Error (sct 0x2 / sc 0x81) MORE DNR 
    Nov 21 11:48:56 UnraidNAS kernel: critical medium error, dev nvme0n1, sector 861069520 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS warning (device dm-3): i/o error at logical 3418402955264 on dev /dev/mapper/nvme0n1p1, physical 440849768448, root 5, inode 1304841, offset 31052587008, length 4096, links 1 (path: domains/Ubuntu/vdisk1.img)
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 4, flush 0, corrupt 8, gen 0
    Nov 21 11:48:56 UnraidNAS kernel: nvme0n1: I/O Cmd(0x2) @ LBA 861134120, 8 blocks, I/O Error (sct 0x2 / sc 0x81) MORE DNR 
    Nov 21 11:48:56 UnraidNAS kernel: critical medium error, dev nvme0n1, sector 861134120 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS warning (device dm-3): i/o error at logical 3418436030464 on dev /dev/mapper/nvme0n1p1, physical 440882843648, root 5, inode 1304841, offset 23452639232, length 4096, links 1 (path: domains/Ubuntu/vdisk1.img)
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 5, flush 0, corrupt 8, gen 0
    Nov 21 11:48:56 UnraidNAS kernel: nvme0n1: I/O Cmd(0x2) @ LBA 861063376, 8 blocks, I/O Error (sct 0x2 / sc 0x81) MORE DNR 
    Nov 21 11:48:56 UnraidNAS kernel: critical medium error, dev nvme0n1, sector 861063376 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS warning (device dm-3): i/o error at logical 3418399809536 on dev /dev/mapper/nvme0n1p1, physical 440846622720, root 5, inode 1304841, offset 31049441280, length 4096, links 1 (path: domains/Ubuntu/vdisk1.img)
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 6, flush 0, corrupt 8, gen 0
    Nov 21 11:48:56 UnraidNAS kernel: nvme0n1: I/O Cmd(0x2) @ LBA 861134136, 8 blocks, I/O Error (sct 0x2 / sc 0x81) MORE DNR 
    Nov 21 11:48:56 UnraidNAS kernel: critical medium error, dev nvme0n1, sector 861134136 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS warning (device dm-3): i/o error at logical 3418436038656 on dev /dev/mapper/nvme0n1p1, physical 440882851840, root 5, inode 1304841, offset 23452647424, length 4096, links 1 (path: domains/Ubuntu/vdisk1.img)
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 7, flush 0, corrupt 8, gen 0
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS warning (device dm-3): i/o error at logical 3418399813632 on dev /dev/mapper/nvme0n1p1, physical 440846626816, root 5, inode 1304841, offset 31049445376, length 4096, links 1 (path: domains/Ubuntu/vdisk1.img)
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 8, flush 0, corrupt 8, gen 0
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS warning (device dm-3): i/o error at logical 3418436042752 on dev /dev/mapper/nvme0n1p1, physical 440882855936, root 5, inode 1304841, offset 23452651520, length 4096, links 1 (path: domains/Ubuntu/vdisk1.img)
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 9, flush 0, corrupt 8, gen 0
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS warning (device dm-3): i/o error at logical 3418399817728 on dev /dev/mapper/nvme0n1p1, physical 440846630912, root 5, inode 1304841, offset 31049449472, length 4096, links 1 (path: domains/Ubuntu/vdisk1.img)
    Nov 21 11:48:56 UnraidNAS kernel: BTRFS error (device dm-3): bdev /dev/mapper/nvme0n1p1 errs: wr 0, rd 10, flush 0, corrupt 8, gen 0
    Nov 21 11:50:02 UnraidNAS kernel: BTRFS info (device dm-3): scrub: finished on devid 1 with status: 0
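With this many lines it is easier to group the warnings by file. Here is a small Python sketch, assuming the warning format shown in the excerpt above, that counts `checksum error` and `i/o error` warnings per path. The name `summarize` and the sample list are only for illustration. The three sample lines are copied from the log above. Counting the full excerpt by hand gives six checksum warnings and ten i/o warnings, which lines up with the `read=10 csum=6` error summary in the scrub report.

```python
import re
from collections import Counter

# Scrub warnings that name a file end in "(path: ...)"; the kind is "checksum" or "i/o".
WARN_RE = re.compile(
    r"BTRFS warning \(device [^)]+\): (checksum|i/o) error at logical \d+ "
    r".*\(path: (.+)\)\s*$"
)


def summarize(lines):
    """Count scrub warnings per (path, kind) pair."""
    counts = Counter()
    for line in lines:
        m = WARN_RE.search(line)
        if m:
            counts[(m.group(2), m.group(1))] += 1
    return counts


# Three lines copied from the syslog excerpt above.
SAMPLE = [
    "Nov 21 11:41:20 UnraidNAS kernel: BTRFS warning (device dm-3): checksum error at "
    "logical 3079392104448 on dev /dev/mapper/nvme0n1p1, physical 67479179264, root 5, "
    "inode 31329563, offset 8192, length 4096, links 1 (path: appdata/binhex-plex/"
    "Plex Media Server/Cache/PhotoTranscoder/97/976adb790e407d0bd4ba7d0dd9809d2dc58d3244.jpg)",
    "Nov 21 11:48:56 UnraidNAS kernel: BTRFS warning (device dm-3): i/o error at "
    "logical 3418399789056 on dev /dev/mapper/nvme0n1p1, physical 440846602240, root 5, "
    "inode 1304841, offset 31049420800, length 4096, links 1 (path: domains/Ubuntu/vdisk1.img)",
    "Nov 21 11:48:56 UnraidNAS kernel: BTRFS warning (device dm-3): i/o error at "
    "logical 3418402951168 on dev /dev/mapper/nvme0n1p1, physical 440849764352, root 5, "
    "inode 1304841, offset 31052582912, length 4096, links 1 (path: domains/Ubuntu/vdisk1.img)",
]

for (path, kind), n in summarize(SAMPLE).items():
    print(f"{n:3d}  {kind:8s}  {path}")
```

Feeding it the whole syslog (for example `summarize(open("/var/log/syslog"))`, path assumed) gives a short list of files to delete or restore from backup.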

     

    Edited by bigbangus
    Link to comment

    OK, I managed to get away with just deleting the affected files. I re-ran the scrub and now I see no errors or warnings in my log, and it finished with status 0.

     

    Nov 21 13:31:02 UnraidNAS kernel: BTRFS info (device dm-3): scrub: finished on devid 1 with status: 0

     

    However, during lunch I did have another error in my syslog, making me wonder if the remaining RAM is still bad or if I need to troubleshoot other components like the motherboard or peripherals. Or slow my RAM clock down. Right now I'm at 3200MHz, which agrees with the B550M Pro4 manual for 2 slots of RAM. Also, back on OS 6.9.x I was running this way for a long time with no issues on those same two sticks.

     

    Nov 21 12:42:38 UnraidNAS kernel: BUG: Bad page state in process shfs  pfn:1befce
    Nov 21 12:42:38 UnraidNAS kernel: page:000000009ad4d718 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1befce

     

     

    unraidnas-diagnostics-20221121-1337.zip

    Edited by bigbangus
    Link to comment

    Ran memtest and got random errors. Noticed I had the two 16GB memory sticks in the A1/B1 slots instead of the correct A2/B2 slots, so the XMP profile wasn't stable.

     

    Switched them to the correct A2/B2 slots @ 3200MHz and it has been running for days without a single error. Thank you for all the help. 

    Link to comment

