greencode Posted January 24, 2020 (edited) My server was working fine about a week ago, and I did a clean shutdown. It was down for about a week. I turned it on yesterday and 3 of the drives are listed as unmountable, and disk 8 is in an error state. They are all new drives that were formatted. When I go to each drive's page, it does list xfs as the file system. I have been reading the boards and saw some suggestions about attempting to repair the MBR. Do I type the commands into the terminal on the web interface, or do I need to plug a keyboard directly into the server and type them there? There is no important data on drives 9 and 10, but I do have data on drive 8. Any advice is greatly appreciated. Attachment: tower-diagnostics-20200124-1107.zip Edited January 25, 2020 by greencode (clarification, spelling)
Decto Posted January 24, 2020 (edited) I'm not an expert, but from what I can see, all the unmountable drives are connected to the SAS card and all the mounted drives are connected to the internal motherboard ports. Disks 8, 9 and 10 all threw errors during initialisation but have clean SMART data. The SAS card seems to have repeatedly tried to connect to the 3 drives. I'd check the seating of the SAS card and its cabling, and that any external expander / JBOD case has clean power. Your disks are likely fine; this is probably a connection issue. Edited January 24, 2020 by Decto
greencode Posted January 25, 2020 (edited) Do you think stopping and restarting the array will help, with everything on? Also, would that cause disk 8 to be disabled? Edited January 25, 2020 by greencode
JorgeB Posted January 25, 2020 The HBA where those 3 drives are connected is constantly re-initializing. Power down, make sure it's well seated (you can also try a different PCIe slot if available), then power back on. If there are no more errors, all disks should mount correctly (disk8 might not if the file system on the emulated disk got corrupted). In any case, post new diags after doing that.
greencode Posted January 25, 2020 Got it working, I think. I pulled the card out and reseated it. Restarted the system, got the following error message, and it just hung for about 10 minutes. Google told me it might be a power/connector issue, so I forced a shutdown and disconnected and reconnected the power to the drives. It booted up the second time; however, drive 8 was listed as missing. I stopped the array, added the drive back and started a rebuild (happening now). Not sure I should have done it, but here are the diags. Hope I did the right thing. Attachment: tower-diagnostics-20200125-0025.zip
JorgeB Posted January 25, 2020 The crash on boot is likely this issue; you just need to try again. Reportedly, power cycling instead of doing a warm reboot increases the chances of success. As for the HBA, no errors so far, so everything looks good for now.
greencode Posted January 25, 2020 (edited) 2 hours ago, johnnie.black said: "The HBA where those 3 drives are connected is constantly re-initializing," Is this a symptom of the PCIe card I bought, maybe it being incompatible? I don't know if you can tell from the data, but it is an LSI SAS 9207-8i card that I bought on eBay. One of the comments said this card works with Unraid, so I tried it out. Edit: I see you ruled out the LSI as a possible suspect in the crash I encountered. Still not sure if the particular card is compatible. Is there a list of compatible hardware posted somewhere? I tried to find one when I was looking but could not. 2 hours ago, johnnie.black said: "if no more errors all disks should mount correctly (disk8 might not if the file system on the emulated disk got corrupted)" How would I check if the file system got corrupted? For future reference, since I assume it is too late to check now that I have started a rebuild. Thanks for all the help, I really appreciate it. Edited January 25, 2020 by greencode (read the link you posted)
JorgeB Posted January 25, 2020 The 9207-8i is a very good option and compatible with Unraid, though there could always be some compatibility issue with your particular board or a problem with the HBA itself. 14 minutes ago, greencode said: "How would I check if the file system got corrupted?" If the disk was unmountable there would be corruption; since it mounted correctly, it should be fine. But yes, you need to be careful before rebuilding on top. Sometimes when this happens and you know the disk wasn't the problem, it's better to do a new config and re-sync/check parity instead of rebuilding the disk on top of the old one.
greencode Posted February 3, 2020 (edited) On 1/25/2020 at 2:12 AM, johnnie.black said: "The crash on boot is likely this issue, you just need to try again, reportedly power cycling increases the chances of success, instead of a warm reboot." When you say power cycling, do you mean a forced shutdown, i.e. holding down the power button? Any advice to keep this from happening? It seems that every time I turn off and restart the server this happens. The server boots up fine, but then it throws this error. I think it is related (but if it isn't and I need to make a new topic, I can), since the web interface at the server's IP address shows disk 8 as unmountable (emulated) and disks 9 and 10 as unmountable with no file system. Attachment: tower-diagnostics-20200203-1430.zip Edited February 3, 2020 by greencode (clarification)
JorgeB Posted February 4, 2020 9 hours ago, greencode said: "Any advice to keep this from happening it seems every time I turn off an restart the server this happens." If it was because of the Intel NIC issue, that was fixed in v6.8.2, so update. But your problem seems different: the LSI HBA is constantly faulting and resetting:
Feb 3 14:20:08 Tower kernel: mpt2sas_cm0: fault_state(0x2622)!
Feb 3 14:20:08 Tower kernel: mpt2sas_cm0: sending diag reset !!
Feb 3 14:20:09 Tower kernel: mpt2sas_cm0: diag reset: SUCCESS
This keeps repeating in the syslog. Check that the HBA is well seated and/or try a different slot if available. You'll then need to check the file system on the affected disks, but only after fixing the HBA issue.
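For anyone wanting to confirm the same pattern in their own diagnostics, here is a minimal Python sketch that counts the fault and reset messages quoted above. It is only an illustration: the function name `summarize_hba_faults` and the `SAMPLE` list are made up for this example, and the regular expressions assume the syslog lines look exactly like the three shown in the post (other driver versions may word them differently).

```python
import re
from collections import Counter

# Patterns modelled on the kernel messages quoted in the post, e.g.
#   "... kernel: mpt2sas_cm0: fault_state(0x2622)!"
#   "... kernel: mpt2sas_cm0: diag reset: SUCCESS"
# Assumption: other systems log these messages in the same shape.
FAULT_RE = re.compile(r"kernel: (mpt\dsas_cm\d+): fault_state\((0x[0-9a-fA-F]+)\)")
RESET_RE = re.compile(r"kernel: (mpt\dsas_cm\d+): diag reset: (\w+)")

def summarize_hba_faults(lines):
    """Count HBA fault states and completed diag resets in syslog lines.

    Returns two Counters keyed by (controller, fault_code) and
    (controller, reset_result) respectively.
    """
    faults = Counter()
    resets = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            faults[(match.group(1), match.group(2))] += 1
            continue
        match = RESET_RE.search(line)
        if match:
            resets[(match.group(1), match.group(2))] += 1
    return faults, resets

# The three lines quoted in the post, used here as sample input.
SAMPLE = [
    "Feb 3 14:20:08 Tower kernel: mpt2sas_cm0: fault_state(0x2622)!",
    "Feb 3 14:20:08 Tower kernel: mpt2sas_cm0: sending diag reset !!",
    "Feb 3 14:20:09 Tower kernel: mpt2sas_cm0: diag reset: SUCCESS",
]

faults, resets = summarize_hba_faults(SAMPLE)
for (controller, code), count in faults.items():
    print(f"{controller}: fault_state {code} seen {count} time(s)")
for (controller, result), count in resets.items():
    print(f"{controller}: diag reset {result} seen {count} time(s)")
```

To run it against a real log, pass an open file object instead of `SAMPLE`, since the function accepts any iterable of lines. A count that keeps growing between boots would match the "constantly faulting and resetting" behaviour described above; a count of zero after reseating the card would suggest the HBA has settled.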
greencode Posted February 4, 2020 4 hours ago, johnnie.black said: "If it was because of the Intel NIC issue it was fixed on v6.8.2, update, but your problem seems different, LSI HBA is constantly faulting and resetting:" Could this be because of a bad card? I moved it to a new slot and that seems to have helped. The two drives (9 and 10) now show up normally. However, disk 8 is still disabled. How do I go about checking my file system before fixing drive 8? Attachment: tower-diagnostics-20200204-0428.zip
JorgeB Posted February 4, 2020 greencode said: "Could this be because of a bad card?" It could, but if that's the case it should also start causing problems in the current slot. greencode said: "However the disk 8 is still disabled." That's expected: once a disk is disabled, it needs to be rebuilt. Disk8 is mounting correctly, so there's no need to fix the filesystem for now; just rebuild on top.
greencode Posted February 4, 2020 6 hours ago, johnnie.black said: "That's expected, once a disk is disabled it needs to be rebuilt. Disk8 is mounting correctly, so no need to fix filesystem for now, just rebuild on top." What would be the correct way to do this? Last time I stopped the array and removed disk 8, then restarted the array, stopped it one more time, re-added the drive and pressed rebuild. Is this OK?
JorgeB Posted February 4, 2020 1 minute ago, greencode said: "this ok?" Yep
greencode Posted February 4, 2020 Just now, johnnie.black said: "Yep" OK, thanks for all your help. Hopefully I won't need any more.