squ Posted August 6

I'm pretty new to unRAID and not entirely sure what's going on here, so I'm hoping some of you kind folk can give some advice. I'm not sure what's relevant, so I'll post whatever I can think of that might be useful.

I made some changes to my config a few days ago: I added 2x 240GB SSDs for use as a RAID1 cache pool and moved appdata there.

This morning I wanted to download a torrent. I don't usually leave the Deluge docker running, so I attempted to start it up, but it wouldn't start. I checked the logs and saw errors like this:

Aug 6 13:35:19 Tower kernel: BTRFS warning (device loop2): csum failed root 580 ino 15772 off 4096 csum 0x74e015fc expected csum 0xd2908274 mirror 1
Aug 6 13:35:19 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 50, gen 0
Aug 6 13:35:19 Tower kernel: BTRFS warning (device loop2): csum failed root 580 ino 15772 off 4096 csum 0x74e015fc expected csum 0xd2908274 mirror 1
Aug 6 13:35:19 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 51, gen 0
Aug 6 13:35:19 Tower kernel: BTRFS warning (device loop2): csum failed root 580 ino 15772 off 4096 csum 0x74e015fc expected csum 0xd2908274 mirror 1
Aug 6 13:35:19 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 52, gen 0

I checked for advice here and saw recommendations to run a balance, which I did, but it didn't solve the issue. While looking at the cache settings I noticed that the dialog in the balance window stated the pool had only 160GB of total space, with only a tiny amount free. At one point last week I had a 120GB and a 240GB cache drive, and back then unRAID reported 160GB available. I got to thinking that the way I turned on/upgraded the cache might have caused an issue, so I was preparing to start again.
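For what it's worth, the per-device "corrupt" counter in those BTRFS error lines can be tallied mechanically from a syslog excerpt. A minimal awk sketch, fed the lines quoted above via a here-doc (in practice you would point it at /var/log/syslog instead); note that the affected device here is loop2, a loopback image rather than a raw disk:

```shell
# Report the most recent BTRFS "corrupt" counter per device.
# Reads a syslog excerpt on stdin; here it is fed lines quoted above.
awk '/BTRFS error/ && match($0, /corrupt [0-9]+/) {
    split(substr($0, RSTART, RLENGTH), c, " ")        # c[2] = counter value
    if (match($0, /\(device [^)]+\)/))                # e.g. "(device loop2)"
        count[substr($0, RSTART + 8, RLENGTH - 9)] = c[2]
} END { for (d in count) print d, "corrupt:", count[d] }' <<'EOF'
Aug 6 13:35:19 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 50, gen 0
Aug 6 13:35:19 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 0, flush 0, corrupt 52, gen 0
EOF
# → loop2 corrupt: 52
```

A steadily climbing counter like this (50, 51, 52 within one second) means reads are repeatedly failing checksum verification, which is why a balance alone doesn't fix it: balance relocates data, it does not repair the source of the corruption.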
I was checking that I had valid backups of my appdata in case anything went wrong while moving things from cache to the array. Looking at my backup directory I noticed around 120GB of space used by old failed backups, so I deleted those. Straight after that, one of my hard drives started making a strange chirping noise, then a kind of scraping noise.

I wanted to spin down the array, since that seemed like the right thing to do, but the button in the UI wasn't working (the confirmation pop-up closed, but the usual unRAID animation did not appear). So I kicked off a shutdown from the UI. The reboot stalled at one of the unmount operations, and after around 20 minutes I powered off the machine and booted up again.

On boot I checked the array and saw that one of my disks (disk4) is being reported as unassigned, so presumably that's the drive that was making weird noises and has now died. I stopped the array and pulled and reseated that disk on the off chance it was a loose connection or something. Upon restarting the array I saw the message "Unmountable: Unsupported or no file system" against all my drives, including the cache ones.

I'm able to browse shares as normal, but the Main UI is prompting me to format all the drives, which naturally would be a silly thing to do. How do I proceed safely without risking loss of data? The failed drive is a 4TB; I have an 8TB drive already in the array with plenty of free space that I can move the emulated data to if that's an option.

tower-diagnostics-20240806-1352.zip.crdownload
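Before touching anything in a state like this, a strictly read-only sanity check is to confirm which mount points actually exist, since "browse shares as normal" and "unmountable" at the same time suggests the UI and the filesystem layer disagree. A minimal sketch; the paths assume the default unRAID layout (/mnt/user for user shares, /mnt/cache for the pool, /mnt/diskN per array disk) and nothing here writes to disk:

```shell
# Read-only check: report which expected unRAID mount points exist.
# Default-layout paths; adjust the list for your own disks.
for p in /mnt/user /mnt/cache /mnt/disk4; do
    if [ -d "$p" ]; then
        echo "$p: present"
    else
        echo "$p: missing"
    fi
done
```

On a healthy system with the array started, all three would report present; a missing /mnt/disk4 alongside a present /mnt/user would be consistent with the disk dropping out while the share layer keeps serving the emulated copy.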
squ Posted August 6 (Author)

At this point my array is stopped, and I'm unable to start it again. Errors in the log when I hit "start":

Aug 6 14:06:32 Tower emhttpd: cmdStart: already started
Aug 6 14:06:37 Tower smbd[39789]: [2024/08/06 14:06:37.853599, 0] ../../source3/smbd/smb2_service.c:772(make_connection_snum)
Aug 6 14:06:37 Tower smbd[39789]: make_connection_snum: canonicalize_connect_path failed for service rootshare, path /mnt/user

(that smbd pair then repeats several more times with the same service and path)
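Those smbd lines all point at one thing: Samba cannot resolve the share's backing path, presumably because /mnt/user is not mounted while the array is stopped, so they are a symptom rather than a separate problem. For repetitive log spam like this, the distinct failing service/path pairs can be extracted mechanically; a sketch run against the line quoted above:

```shell
# Pull the failing Samba service and path out of canonicalize_connect_path
# errors. Uses the two fixed phrases in the message as field separators.
awk -F 'failed for service |, path ' '/canonicalize_connect_path failed/ {
    print "service:", $2, "| path:", $3
}' <<'EOF'
Aug 6 14:06:37 Tower smbd[39789]: make_connection_snum: canonicalize_connect_path failed for service rootshare, path /mnt/user
EOF
# → service: rootshare | path: /mnt/user
```

Piping the real syslog through the same filter (plus `sort -u`) would show at a glance whether only rootshare is affected or every exported share is failing the same way.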
trurl Posted August 6

btrfs csum errors are often caused by bad RAM. Have you run memtest?
squ Posted August 6 (Author)

Just now, trurl said: "Have you done memtest?"

Not since I set up the server a few months back. I let it run for about 36 hours (mainly because I didn't realise it just runs for as long as you leave it) with no errors. Would I test again by rebooting and selecting it from the blue boot menu when it appears?
squ Posted August 6 (Author)

Since the UI was unresponsive I've rebooted, and the error that was of most concern to me ("Unsupported or no file system" showing on all disks) has gone. I still have a borked drive, but that's easy enough to sort. I'll run a memtest overnight if RAM is likely to be the culprit for the original problem.