February 10, 2023 I bought an upgrade for one of my drives (a larger drive) and installed it a few days ago. As of this morning the new drive (disk 9) is disabled. plexserver-diagnostics-20230210-0716.zip
February 10, 2023 Community Expert Perhaps more seriously, you have checksum errors on cache2. Have you done memtest lately? Your syslog is so full of that problem I had to filter it all out to try to find something about disk9. I didn't see anything; probably it was already disabled before reboot. Diagnostics only contain the current syslog, so there is no way to see what happened before reboot. First thing would be to run memtest. The built-in memtest won't boot in UEFI mode. Better if you download and boot into memtest86 anyway.
February 10, 2023 Community Expert Disable Docker and VM Manager in Settings and leave them disabled until you get cache fixed. See if running a scrub will help anything. Then reboot and post new diagnostics so we can at least get clean logs. As for your disabled disk, it was possibly caused by a bad connection. You will have to rebuild it after double checking all connections, both ends, SATA and power, including splitters.
February 11, 2023 Author I disabled Docker (I have no VMs), ran the scrub on cache, restarted, and pulled a new diagnostics file. plexserver-diagnostics-20230210-1710.zip
February 11, 2023 Community Expert Still seeing some indications of corruption on cache:
Feb 10 17:09:55 PlexServer kernel: BTRFS info (device sdg1): bdev /dev/sdg1 errs: wr 1874605108, rd 103989329, flush 11530688, corrupt 354867034, gen 313704
Feb 10 17:09:55 PlexServer kernel: BTRFS info (device sdg1): bdev /dev/sdf1 errs: wr 435937566, rd 3714670, flush 3150566, corrupt 294517547, gen 185192
Not sure scrub helped. Since this was shortly after reboot, it could be that it would continue to log a lot of these. I do use btrfs for my cache pool, but I have no experience with fixing problems with it. I'm going to see if @JorgeB will take a look and suggest how to proceed. It is way past bedtime in his part of the world so it may be several hours before he can respond. In the meantime, it won't hurt anything to go ahead and start rebuilding that array disk.
4 hours ago, trurl said: As for your disabled disk, possibly caused by bad connection. You will have to rebuild it after double checking all connections, both ends, SATA and power, including splitters.
https://wiki.unraid.net/Manual/Storage_Management#Rebuilding_a_drive_onto_itself
February 11, 2023 Community Expert Filesystem stats show that both pool devices dropped offline in the past, possibly one at a time. That can still be a problem if the other one was not in sync from having dropped earlier. Run a correcting scrub and post that output together with new diags.
February 12, 2023 Community Expert You didn't post the scrub output. Also, the log is full of errors for cache2; check/replace cables.
February 12, 2023 Author Swapped out some of the SATA cables and verified every single cable. The 2nd cache drive's SATA cable was actually only halfway in. plexserver-diagnostics-20230212-1539.zip Edited February 12, 2023 by fespinoza831
February 13, 2023 Community Expert All errors were corrected so it should be OK for now, assuming no NOCOW shares exist. See here for more info and better pool monitoring.
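The link behind "here" did not survive in this copy of the thread, so the following is only a guess at what basic pool monitoring could look like, not a reproduction of the linked advice. It assumes you capture the text printed by `btrfs device stats` for the pool's mount point (on Unraid a pool named cache is normally mounted at /mnt/cache, which is an assumption here) and flags any counter that is not zero. The function name and the sample values are made up for illustration.

```python
import re

# One line of `btrfs device stats` output has the form
# "[/dev/sdX1].write_io_errs    0".
LINE_RE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")


def nonzero_counters(stats_text):
    """Return {device: {counter: value}} for every counter above zero."""
    problems = {}
    for line in stats_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognized lines
        value = int(match.group("value"))
        if value > 0:
            problems.setdefault(match.group("dev"), {})[match.group("counter")] = value
    return problems


# Illustrative sample only; these numbers are NOT from the diagnostics in this thread.
SAMPLE = """\
[/dev/sdf1].write_io_errs    0
[/dev/sdf1].read_io_errs     0
[/dev/sdf1].flush_io_errs    0
[/dev/sdf1].corruption_errs  0
[/dev/sdf1].generation_errs  0
[/dev/sdg1].write_io_errs    12
[/dev/sdg1].read_io_errs     0
[/dev/sdg1].flush_io_errs    0
[/dev/sdg1].corruption_errs  3
[/dev/sdg1].generation_errs  0
"""

print(nonzero_counters(SAMPLE))
```

A healthy pool should give an empty result. Any device that shows up is worth checking for cabling or power problems before the counters grow further.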
February 13, 2023 Author What's NOCOW? I also notice that this cache corruption tends to happen when I manually add files from my Windows PC.
February 14, 2023 Community Expert
Feb 12 15:38:11 PlexServer kernel: BTRFS info (device sdg1): bdev /dev/sdg1 errs: wr 1874605108, rd 103989329, flush 11530688, corrupt 354867034, gen 313704
Feb 12 15:38:11 PlexServer kernel: BTRFS info (device sdg1): bdev /dev/sdf1 errs: wr 466071269, rd 7921894, flush 3554463, corrupt 387371438, gen 243232
Feb 13 06:59:20 PlexServer kernel: BTRFS info (device sdf1): bdev /dev/sdg1 errs: wr 2005518254, rd 111584553, flush 11710630, corrupt 419236637, gen 348739
Feb 13 06:59:20 PlexServer kernel: BTRFS info (device sdf1): bdev /dev/sdf1 errs: wr 466071269, rd 7921894, flush 3554463, corrupt 387371438, gen 243232
Comparing both sets of stats you can see the new write errors for sdg, meaning it dropped offline again between those times. If the scrub cannot correct all errors you need to back up what you can and restore the pool, but more importantly you need to fix the device dropping issue, or there will be more issues in the future.
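The comparison described above can also be done mechanically. This is a minimal sketch that parses the kernel's "bdev ... errs:" lines and reports which counters grew between two boots. The log lines are copied from this thread; the function names are hypothetical and not part of any btrfs tooling.

```python
import re

# Matches the per-device error summary that btrfs logs when the pool is mounted.
STATS_RE = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_stats(line):
    """Return (device, counters) for one log line, or None if it has no counters."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    counters = {k: int(v) for k, v in match.groupdict().items() if k != "dev"}
    return match.group("dev"), counters


def new_errors(before_lines, after_lines):
    """Return {device: {counter: increase}} for counters that grew."""
    before = dict(p for p in map(parse_stats, before_lines) if p)
    after = dict(p for p in map(parse_stats, after_lines) if p)
    result = {}
    for dev, counters in after.items():
        old = before.get(dev, {})
        delta = {k: v - old.get(k, 0) for k, v in counters.items() if v > old.get(k, 0)}
        if delta:
            result[dev] = delta
    return result


feb12 = [
    "Feb 12 15:38:11 PlexServer kernel: BTRFS info (device sdg1): bdev /dev/sdg1 errs: wr 1874605108, rd 103989329, flush 11530688, corrupt 354867034, gen 313704",
    "Feb 12 15:38:11 PlexServer kernel: BTRFS info (device sdg1): bdev /dev/sdf1 errs: wr 466071269, rd 7921894, flush 3554463, corrupt 387371438, gen 243232",
]
feb13 = [
    "Feb 13 06:59:20 PlexServer kernel: BTRFS info (device sdf1): bdev /dev/sdg1 errs: wr 2005518254, rd 111584553, flush 11710630, corrupt 419236637, gen 348739",
    "Feb 13 06:59:20 PlexServer kernel: BTRFS info (device sdf1): bdev /dev/sdf1 errs: wr 466071269, rd 7921894, flush 3554463, corrupt 387371438, gen 243232",
]

for dev, delta in new_errors(feb12, feb13).items():
    print(dev, delta)
```

With these four lines only /dev/sdg1 is reported, since every counter for /dev/sdf1 is identical in both snapshots.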
February 14, 2023 Author So is it two different devices dropping offline? Could this be a PSU issue? I'm only using a 650 watt PSU.
February 14, 2023 Community Expert Solution SSDs don't need much power. It could be a bad power connection or bad PSU; SATA cables/controller are also an option.
February 14, 2023 Author There are other hard drives connected to the same PSU cable (before it and after it). The SATA cables are new. I will try other cables and other ports. So a 650 W PSU is OK for this many hard drives?
February 18, 2023 Author Thank you very much. It's been up and running for 3 days now with no errors. It was a bad SATA cable.
March 2, 20233 yr Author The issue came back...been happening for the past two days in a row plexserver-diagnostics-20230302-1630.zip
March 3, 2023 Community Expert Still looks like a cable/connection problem. If those were already replaced it could be a device problem.