December 5, 2024 Hi, I'm sure this has been asked many times, but it is scary the first time it happens and I want to make sure that everything is done correctly. During the night a disk had read errors (more than 2000) and is now marked as in error. I tried to stop the array, but after a few hours it still says "Array Stopping • Retry unmounting user share(s)...". Of course I added a disk yesterday and moved some things around in the case, so that could be causing the issue, but the new disk is still pre-clearing, so I cannot turn off the server to check the cables.

I have questions on how to proceed properly:
- Check the cables when I can turn off the server.
- If it is still not working, do I run some checks on the disk? Do I keep the array down until I know what is wrong with the disk?

Also, if the disk is just broken, can I take it out and still use the server without it, without losing the data on it (from parity)? Last question: can I replace that disk with a bigger one and then rebuild the array, with all the data being written to the new drive? Thanks for the help!

Edited December 7, 2024 by Eysenor
December 5, 2024 Community Expert Please post the diagnostics. If you cannot get them now, post them after a reboot and array start.
December 5, 2024 Community Expert Solution This looks more like a power/connection issue. Type powerdown in the CLI; if the server doesn't shut down after 5 minutes you will need to force it. Then check/replace the cables for disk2 and post new diags after array start.
December 5, 2024 Author I detached and reconnected all the cables to the missing drive and it still shows as disabled. I attached the new diagnostics. This time too, if I try to stop the array it gets stuck at "Retry unmounting user share(s)...". The drive log has errors like this:

Dec 5 21:06:17 Homeserver kernel: I/O error, dev sdd, sector 15628052934 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Dec 5 21:06:17 Homeserver kernel: Buffer I/O error on dev sdd1, logical block 15628052870, async page read
Dec 5 21:06:17 Homeserver kernel: Buffer I/O error on dev sdd1, logical block 15628052871, async page read

I ran a short SMART self-test and it passed. So could it be that the HBA card is somehow acting weirdly? It was working fine before I added another drive, and then the drive failure happened (a few days later, though). If the drive is broken that is fine, I can get another. Mostly I want to be sure I take care of the data correctly. For example, is it possible to use parity to move the data from that drive to another drive that has space?

homeserver-diagnostics-20241205-2138.zip

Edited December 5, 2024 by Eysenor
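For anyone reading along with a similar log: kernel lines like the ones quoted above can be tallied per device to see how widespread the errors are. This is only a minimal sketch, assuming Python 3 on whatever machine you open the saved syslog on. The function name `summarize_io_errors` and the regular expression are made up for illustration and only cover the `I/O error, dev ...` line format shown in this post.

```python
import re
from collections import Counter

# Matches kernel block-layer lines such as:
#   "... kernel: I/O error, dev sdd, sector 15628052934 op 0x0:(READ) ..."
# "Buffer I/O error" lines are deliberately not matched.
IO_ERROR = re.compile(r"kernel: I/O error, dev (\w+), sector (\d+) op \S+:\((\w+)\)")


def summarize_io_errors(lines):
    """Count I/O errors per (device, operation) and track the sector range per device."""
    counts = Counter()
    sectors = {}
    for line in lines:
        match = IO_ERROR.search(line)
        if not match:
            continue
        dev, sector, op = match.group(1), int(match.group(2)), match.group(3)
        counts[(dev, op)] += 1
        low, high = sectors.get(dev, (sector, sector))
        sectors[dev] = (min(low, sector), max(high, sector))
    return counts, sectors


# The two sample lines are copied from the post above.
sample = [
    "Dec 5 21:06:17 Homeserver kernel: I/O error, dev sdd, sector 15628052934 "
    "op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2",
    "Dec 5 21:06:17 Homeserver kernel: Buffer I/O error on dev sdd1, "
    "logical block 15628052870, async page read",
]
counts, sectors = summarize_io_errors(sample)
for (dev, op), n in counts.items():
    print(dev, op, n, sectors[dev])
```

The tally only shows which device and which operation are affected. On its own it does not say whether the disk, the cable, or the controller is at fault.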
December 6, 2024 Community Expert Emulated disk2 is mounting. If the contents look correct and you have already replaced the cables, you can rebuild on top of the same disk. https://docs.unraid.net/unraid-os/manual/storage-management#rebuilding-a-drive-onto-itself
December 6, 2024 Author Thanks for the answer, I'll try that. The only remaining issue is that the array is not stopping and stays stuck at "Retry unmounting user share(s)...". How do I get the array to stop properly? I'm getting this error in the log:

Dec 6 13:08:05 Homeserver root: rmdir: failed to remove '/mnt/user': Device or resource busy
Dec 6 13:08:05 Homeserver emhttpd: shcmd (591): exit status: 1
Dec 6 13:08:05 Homeserver emhttpd: shcmd (593): rm -f /boot/config/plugins/dynamix/mover.cron
Dec 6 13:08:05 Homeserver emhttpd: shcmd (594): /usr/local/sbin/update_cron
Dec 6 13:08:05 Homeserver emhttpd: Retry unmounting user share(s)...

How do I force /mnt/user to unmount?

Edited December 6, 2024 by Eysenor
December 6, 2024 Community Expert Something is still using the shares. Disable the Docker and VM services and reboot in safe mode, then see if you can stop the array normally.
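If you want to see what is still using the shares before rebooting, one option is to look through `/proc` for processes whose working directory or open files sit under `/mnt/user`. The sketch below assumes Python 3 is installed on the server and that it is run as root. `find_holders` is a hypothetical helper name, not an Unraid tool. It only sees ordinary processes, so it will miss kernel-level users such as loop or bind mounts.

```python
import os


def find_holders(mount="/mnt/user", proc_root="/proc"):
    """Return {pid: [paths]} for processes with a cwd or open file under `mount`."""
    mount = mount.rstrip("/")
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process directories
        pid_dir = os.path.join(proc_root, entry)
        links = [os.path.join(pid_dir, "cwd")]
        fd_dir = os.path.join(pid_dir, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in sorted(os.listdir(fd_dir))]
        except OSError:
            pass  # process exited or permission denied
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue
            # Match the mount point itself or anything below it, but not /mnt/user0.
            if target == mount or target.startswith(mount + "/"):
                holders.setdefault(int(entry), []).append(target)
    return holders
```

Calling `find_holders()` with the defaults returns a dictionary of process IDs and the paths they hold. An open shell sitting in a share directory is a common thing to look for, since its working directory alone keeps the mount busy.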
December 6, 2024 Author Thanks a lot for taking the time to help. The disk is now rebuilding; hopefully everything is working again tomorrow when it is done. By the way, for next time, how do I tell whether a disk has a connection problem rather than a problem with the disk itself, besides the SMART test passing?
December 7, 2024 Community Expert It's not always easy to tell. It is mostly based on the type of errors, SMART, the controllers used, etc.
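As a rough illustration of the SMART part of that answer: on ATA drives, `smartctl -A` reports `UDMA_CRC_Error_Count`, which counts errors on the link between drive and controller, while `Reallocated_Sector_Ct`, `Current_Pending_Sector` and `Offline_Uncorrectable` relate to the media itself. The sketch below is a heuristic only, not a diagnosis. The function name `triage` and the example values are made up for illustration, and it assumes you have already copied the raw values out of the SMART report.

```python
# Attribute names as printed by smartctl for ATA drives.
MEDIA_ATTRS = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")
LINK_ATTRS = ("UDMA_CRC_Error_Count",)


def triage(raw):
    """Given {attribute name: raw value}, return coarse hints about where to look first."""
    hints = []
    if any(raw.get(name, 0) > 0 for name in MEDIA_ATTRS):
        hints.append("disk")
    if any(raw.get(name, 0) > 0 for name in LINK_ATTRS):
        hints.append("connection")
    return hints or ["inconclusive"]


# Illustrative values, not taken from the diagnostics in this thread.
print(triage({"UDMA_CRC_Error_Count": 12, "Reallocated_Sector_Ct": 0}))
```

The CRC counter is cumulative, so what matters is whether it keeps increasing after the cables have been reseated or replaced. An "inconclusive" result is common, which fits the point above that the type of errors in the log and the controller in use also have to be considered.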