dziewuliz Posted March 11, 2023
Recently, after upgrading the motherboard and CPU, one of the disks got disabled the same day. I decided to do a full rebuild, and overnight it failed again, so now two disks are being emulated. The SMART check looks clean, so I assume it could have been the cables? Any recommended next steps? tower-diagnostics-20230311-1837.zip
JorgeB Posted March 11, 2023
Single parity can only emulate one disk. You can force-enable disk2, or even both, since they look healthy. Do you know if anything was written to disk3 after it got disabled?
dziewuliz Posted March 11, 2023 (edited)
There's a chance there was some write activity while the rebuild was running... I went ahead, started the array in maintenance mode, and the data rebuild finished successfully. Now, when I toggled the array without Disk2, trying to stop the array results in "Array Stopping • Retry unmounting disk share(s)...", and these log entries:
Mar 11 21:57:03 Tower root: umount: /mnt/disk1: target is busy.
Mar 11 21:57:03 Tower emhttpd: shcmd (9142): exit status: 32
Mar 11 21:57:03 Tower emhttpd: Retry unmounting disk share(s)...
I rebooted, selected Disk2 again, and started the array. Now it's reconstructing at an estimated speed of 3.2 MB/sec, estimated to finish in 60 days 😀 I've never had such a long estimate before; parity checks typically completed within 1-2 days. Should I swap to another drive? The speed issue must be related to the newly connected LSI SATA SAS 9211-8i.
dziewuliz Posted March 11, 2023
tower-diagnostics-20230311-2324.zip
The speed issue seems to affect the drives connected to the newly installed LSI SATA SAS 9211-8i (SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]).
trurl Posted March 11, 2023
Was it actually rebuilding when you took those diagnostics? Anything writing to your array while rebuilding? Probably unrelated, but I've never seen this much used docker.img:
Filesystem Size Used Avail Use% Mounted on
/dev/loop2 1000G 766G 231G 77% /var/lib/docker
Often a 20G docker.img is plenty. I seldom use more than half of 20G running 15 dockers. You are doing it wrong. The usual reason for filling docker.img is an application writing to a path that isn't mapped. Also, you are using a version of Unraid that doesn't tell us which disks your user shares are using, so there may be other things about your configuration that could be improved.
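For anyone following along, the Use% figure in the df output above can be pulled out with a one-liner, which makes it easy to check docker.img after a cleanup. This is only a sketch: `usage_pct` is a made-up helper name, and it assumes the POSIX `df -P` layout, where the fifth column is the percentage used.

```shell
# usage_pct is a hypothetical helper name, not an Unraid or Docker command.
# It reads `df -P` style output on stdin and prints the Use% value of the
# first filesystem line as a bare number.
usage_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# On the server (assumes docker.img is mounted at /var/lib/docker, as in the
# df output quoted above):
#   df -P /var/lib/docker | usage_pct
```

If the number keeps climbing while containers run, that would be consistent with an application writing to an unmapped path, as described above.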
dziewuliz Posted March 11, 2023 (edited)
The array was turned on and the data rebuild was paused. I might have had DiskSpeed running (to compare mainboard SATA and LSI HBA speeds) when exporting diagnostics. Thanks for the callout on docker.img size; I will look into that.
Quote: Also, you are using a version of Unraid that doesn't tell us which disks your user shares are using
Updating to unRAIDServer-6.11.5...
dziewuliz Posted March 11, 2023
tower-diagnostics-20230312-0027.zip
dziewuliz Posted March 11, 2023 (edited)
@trurl thanks. After installing Portainer, I reclaimed space by removing dangling docker volumes.
trurl Posted March 11, 2023
2 hours ago, dziewuliz said: unRAIDServer-6.11.5...
The system share has files on the array.
1 hour ago, dziewuliz said: reclaimed space
Post new diagnostics.
dziewuliz Posted March 12, 2023
@trurl I have really neglected my setup, thanks for helping me clear it up. I've rebuilt docker.img, and made system share cache only. tower-diagnostics-20230312-1836.zip
dziewuliz Posted March 12, 2023
Switched from the LSI HBA back to the SATA expansion board and replaced the cables. Now rebuilding at a steady 240MB/s.
Solution trurl Posted March 12, 2023
6 hours ago, dziewuliz said: made system share cache only
Simply changing that setting only controls what happens to new files, and mover ignores cache:only shares. You have to set it to cache:prefer to get mover to move the files to cache. Nothing can move open files, so you have to disable Docker and VM Manager in Settings before they can be moved. Also, mover won't replace files, so if anything is already on cache you will have to decide whether to keep those files or the system files on disk1. Install the Dynamix File Manager plugin; it will let you work with files directly on the server. What do you get from the command line with this?
ls -lah /mnt/cache/system
and this?
ls -lah /mnt/disk1/system
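Since mover won't replace files, it can help to see up front which files exist in both locations before deciding which copy to keep. The following is a minimal sketch, not an Unraid tool: `list_conflicts` is a hypothetical helper name, and the two paths in the usage comment are simply the ones from the `ls` commands above.

```shell
# list_conflicts is a hypothetical helper, not part of Unraid.
# It prints the relative path of every regular file that exists under
# BOTH directories, i.e. the files mover would refuse to overwrite.
list_conflicts() {
    ( cd "$1" && find . -type f ) | sort | while IFS= read -r rel; do
        if [ -f "$2/$rel" ]; then
            printf '%s\n' "$rel"
        fi
    done
}

# On the server, with Docker and VM Manager disabled first:
#   list_conflicts /mnt/cache/system /mnt/disk1/system
```

An empty result would mean there is nothing to decide and mover can move everything once the share is set to cache:prefer. The helper only reads; it does not move or delete anything.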
dziewuliz Posted March 13, 2023
I would like to try that, but my overnight rebuild has now ended with my parity disk down. tower-diagnostics-20230313-0900.zip
trurl Posted March 13, 2023
Diagnostics contains the current syslog. Current syslog is in RAM like the rest of the OS, so can't tell us anything about what happened before reboot. Do you have diagnostics or syslog from before rebooting?
dziewuliz Posted March 13, 2023 (edited)
Unfortunately, when I checked in the morning, the server was frozen and non-responsive, so I had to restart it manually. Is there anything I can do to identify what happened? It's like walking through a minefield...
JorgeB Posted March 13, 2023
The screenshot suggests parity got disabled during the rebuild. Since SMART looks OK but there's a recent UDMA CRC error, it likely happened due to a bad SATA cable.
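To keep an eye on that counter from the command line, the raw value can be read out of `smartctl -A` (from smartmontools). A rough sketch follows, with a few assumptions: `crc_count` is a made-up helper name, `/dev/sdX` is a placeholder for the actual device, and the drive is assumed to report the attribute under the usual name `UDMA_CRC_Error_Count` with the raw value in the last column.

```shell
# crc_count is a hypothetical helper name.
# It reads `smartctl -A` output on stdin and prints the raw value (last
# column) of the UDMA_CRC_Error_Count attribute, if the drive reports it.
crc_count() {
    awk '$2 == "UDMA_CRC_Error_Count" { print $NF }'
}

# On the server (replace /dev/sdX with the real device):
#   smartctl -A /dev/sdX | crc_count
```

The counter is cumulative and does not reset, so a non-zero value on its own is not alarming. What matters is whether it keeps increasing after the cable or controller has been swapped.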
dziewuliz Posted March 19, 2023
Sorry for the delayed response; I had to take a break from fruitless troubleshooting. Some additional background on what happened prior to the issue: my previous motherboard had 6 SATA ports (I have 5 drives), and I switched to a new motherboard with 4 ports. My first SATA expansion board ($15, purchased on Amazon) shorted upon powering up. I immediately powered off and got myself a PCIe test card to check whether there was any damage to the slot itself; it looked good. Since then, I bought another make of SATA expansion board and new SATA cables and connected the drives, and these problems started. Currently, parity is disabled, and after rebooting it shows a CRC error count of 2. It also happens to be the one connected to the expansion card. What are my options here?
1. Switch to the LSI SATA SAS 9211-8i I tried before (albeit at super slow speed).
2. Look for alternative ways to connect the 5th disk.
JorgeB Posted March 20, 2023
Try a different controller. You can take a look here for some recommendations.
dziewuliz Posted March 25, 2023
Thanks both. After taking the suggested steps and cleaning up my docker mess, my new LSI controller works at full speed, and I was able to rebuild parity without issues this time. Lesson learned: stick to the recommended LSI SAS controllers, and avoid the inexpensive SATA expansion boards. @JorgeB let me know your donation link please.
JorgeB Posted March 25, 2023
You can buy me a beer at paypal.me/blackjohnnie Thanks.
dziewuliz Posted March 28, 2023
Beer bought. Thanks again both!