fespinoza831 Posted February 10, 2023 I bought an upgrade for one of my drives (a larger drive). I replaced it a few days ago, and this morning the new drive is disabled (disk 9). plexserver-diagnostics-20230210-0716.zip
trurl Posted February 10, 2023 Perhaps more seriously, you have checksum errors on cache2. Have you done memtest lately? Your syslog is so full of that problem that I had to filter it all out to try to find something about disk9. I didn't see anything; it was probably already disabled before the reboot. Diagnostics contain only the current syslog, so there is no way to see what happened before the reboot. The first thing would be to run memtest. The built-in memtest won't boot in UEFI mode, so it is better to download and boot into memtest86 anyway.
fespinoza831 Posted February 10, 2023 Author Thanks, doing this now.
fespinoza831 Posted February 10, 2023 Author It passed.
trurl Posted February 10, 2023 Disable Docker and VM Manager in Settings and leave them disabled until you get cache fixed. See if running a scrub helps. Then reboot and post new diagnostics so we can at least get clean logs. As for your disabled disk, it was possibly caused by a bad connection. You will have to rebuild it after double-checking all connections, both ends, SATA and power, including splitters.
fespinoza831 Posted February 11, 2023 Author I disabled Docker (I have no VMs), ran the scrub on cache, restarted, and pulled a new diagnostics file. plexserver-diagnostics-20230210-1710.zip
trurl Posted February 11, 2023 Still seeing some indications of corruption on cache:
Feb 10 17:09:55 PlexServer kernel: BTRFS info (device sdg1): bdev /dev/sdg1 errs: wr 1874605108, rd 103989329, flush 11530688, corrupt 354867034, gen 313704
Feb 10 17:09:55 PlexServer kernel: BTRFS info (device sdg1): bdev /dev/sdf1 errs: wr 435937566, rd 3714670, flush 3150566, corrupt 294517547, gen 185192
Not sure the scrub helped. Since this was shortly after reboot, it could be that it would continue to log a lot of these. I do use btrfs for my cache pool, but I have no experience with fixing problems with it. I'm going to see if @JorgeB will take a look and suggest how to proceed. It is way past bedtime in his part of the world, so it may be several hours before he can respond. In the meantime, it won't hurt anything to go ahead and start rebuilding that array disk.
4 hours ago, trurl said: As for your disabled disk, possibly caused by bad connection. You will have to rebuild it after double checking all connections, both ends, SATA and power, including splitters.
https://wiki.unraid.net/Manual/Storage_Management#Rebuilding_a_drive_onto_itself
JorgeB Posted February 11, 2023 Filesystem stats show that both pool devices dropped offline in the past, possibly one at a time. That can still be a problem if the other one was not in sync from dropping before. Run a correcting scrub and post the output together with new diags.
fespinoza831 Posted February 12, 2023 Author Rebuilt the array disk. plexserver-diagnostics-20230211-1946.zip
JorgeB Posted February 12, 2023 You didn't post the scrub output. Also, the log is full of errors for cache2; check/replace cables.
fespinoza831 Posted February 12, 2023 Author Swapped out some of the SATA cables and verified every single cable. The second cache drive's SATA cable was actually only halfway in. plexserver-diagnostics-20230212-1539.zip Edited February 12, 2023 by fespinoza831
JorgeB Posted February 13, 2023 All errors were corrected, so it should be OK for now, assuming no NOCOW shares exist. See here for more info and better pool monitoring.
fespinoza831 Posted February 13, 2023 Author What's NOCOW? I also notice that this cache corruption tends to happen when I manually add files from my Windows PC.
fespinoza831 Posted February 13, 2023 Author I also notice that if I try to run the scrub now it shows
fespinoza831 Posted February 13, 2023 Author I also see this often
fespinoza831 Posted February 13, 2023 Author Sorry, thought I had. plexserver-diagnostics-20230213-1654.zip
JorgeB Posted February 14, 2023
Feb 12 15:38:11 PlexServer kernel: BTRFS info (device sdg1): bdev /dev/sdg1 errs: wr 1874605108, rd 103989329, flush 11530688, corrupt 354867034, gen 313704
Feb 12 15:38:11 PlexServer kernel: BTRFS info (device sdg1): bdev /dev/sdf1 errs: wr 466071269, rd 7921894, flush 3554463, corrupt 387371438, gen 243232
Feb 13 06:59:20 PlexServer kernel: BTRFS info (device sdf1): bdev /dev/sdg1 errs: wr 2005518254, rd 111584553, flush 11710630, corrupt 419236637, gen 348739
Feb 13 06:59:20 PlexServer kernel: BTRFS info (device sdf1): bdev /dev/sdf1 errs: wr 466071269, rd 7921894, flush 3554463, corrupt 387371438, gen 243232
Comparing both stats, you can see the new write errors for sdg, meaning it dropped offline again between those times. If the scrub cannot correct all errors, you need to back up what you can and restore the pool, but more importantly you need to fix the device-dropping issue, or there will be more issues in the future.
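The comparison described in the post above can be done by hand, but as a rough sketch it can also be scripted. The Python below is only an illustration: the function names `parse_stats` and `diff_stats` and the regular expression are assumptions made for this example, not part of Unraid or btrfs tooling. It takes the four syslog lines quoted above, reads the per-device counters, and reports which counters changed between the two mounts.

```python
import re

# Assumed pattern for the "bdev ... errs:" lines quoted in the thread.
STATS_RE = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_stats(lines):
    """Return {device: {counter: value}}; a later line for a device wins."""
    stats = {}
    for line in lines:
        match = STATS_RE.search(line)
        if match:
            fields = match.groupdict()
            dev = fields.pop("dev")
            stats[dev] = {name: int(value) for name, value in fields.items()}
    return stats


def diff_stats(before, after):
    """Return {device: {counter: increase}} for counters that changed."""
    changes = {}
    for dev in before.keys() & after.keys():
        delta = {
            name: after[dev][name] - before[dev][name]
            for name in before[dev]
            if after[dev][name] != before[dev][name]
        }
        if delta:
            changes[dev] = delta
    return changes


# The two snapshots quoted in the post (Feb 12 and Feb 13 mounts).
FEB_12 = [
    "Feb 12 15:38:11 PlexServer kernel: BTRFS info (device sdg1): bdev /dev/sdg1 errs: wr 1874605108, rd 103989329, flush 11530688, corrupt 354867034, gen 313704",
    "Feb 12 15:38:11 PlexServer kernel: BTRFS info (device sdg1): bdev /dev/sdf1 errs: wr 466071269, rd 7921894, flush 3554463, corrupt 387371438, gen 243232",
]
FEB_13 = [
    "Feb 13 06:59:20 PlexServer kernel: BTRFS info (device sdf1): bdev /dev/sdg1 errs: wr 2005518254, rd 111584553, flush 11710630, corrupt 419236637, gen 348739",
    "Feb 13 06:59:20 PlexServer kernel: BTRFS info (device sdf1): bdev /dev/sdf1 errs: wr 466071269, rd 7921894, flush 3554463, corrupt 387371438, gen 243232",
]

changes = diff_stats(parse_stats(FEB_12), parse_stats(FEB_13))
for dev, delta in sorted(changes.items()):
    print(dev, delta)
```

With the lines from the post, only /dev/sdg1 shows up in the result, which matches the reading that sdg dropped again between the two mounts while sdf1's counters stayed the same. Keep in mind these counters are cumulative, so only the change between snapshots is meaningful.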
fespinoza831 Posted February 14, 2023 Author So is it two different devices dropping offline? Could this be a PSU issue? I'm only using a 650 watt PSU.
Solution JorgeB Posted February 14, 2023 SSDs don't need much power. It could be a bad power connection or a bad PSU; SATA cables/controller are also an option.
fespinoza831 Posted February 14, 2023 Author There are other hard drives connected to the same PSU cable (before it and after it). The SATA cables are new. I will try other cables and other ports. So a 650 watt PSU is OK for this many hard drives?
JorgeB Posted February 15, 2023 If it's a good quality PSU and working correctly it's enough.
fespinoza831 Posted February 18, 2023 Author Thank you very much. It's been up and running for 3 days now with no errors; it was a bad SATA cable.
fespinoza831 Posted March 2, 2023 Author The issue came back. It has been happening for the past two days in a row. plexserver-diagnostics-20230302-1630.zip
JorgeB Posted March 3, 2023 Still looks like a cable/connection problem. If those were already replaced, it could be a device problem.