SOLVED - 2 issues | Unresponsive server + disks falling out during reboot



Hey again,

 

Data has been copied out to spare drives (for the most part).

I stopped the array, removed Parity 1 and Disk 1 (6TB), started the array, stopped it again, then added only Disk 1 back in (leaving the Parity 1 disk out) and started a rebuild of the 6TB drive. (I did this around 9-10ish this morning, Saturday.)

Now at around 12 (3 hours later) I came back to check, and the system is completely locked up (again). The WebGUI does not respond and I'm back to the picture from earlier in the thread (posted Thursday at 05:06 PM).

 

Does anyone have a clue WTF is going on?

Software error?
Or is some of the hardware broken?

Where do I continue with the troubleshooting?

 

//Magmanthe
 

Edited by Magmanthe

The previous ZIP I added should contain any errors, because the syslog feature is active.


However there are a million folders and files there, so I don't know where or what to look for.
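
One way to narrow down where to look is to walk the extracted ZIP and count, per file, how many lines mention something error-like. This is only a rough sketch: the keyword list is an assumption (not an official Unraid list), and the function name count_hits is made up for illustration.

```python
import os

# Assumed keyword list for illustration only; adjust to taste.
KEYWORDS = ("error", "fail", "timeout", "reset")

def count_hits(root, keywords=KEYWORDS):
    """Return {relative_path: matching_line_count} for files under root."""
    hits = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            count = 0
            try:
                # errors="replace" keeps odd bytes from aborting the scan
                with open(path, "r", errors="replace") as fh:
                    for line in fh:
                        lower = line.lower()
                        if any(k in lower for k in keywords):
                            count += 1
            except OSError:
                continue  # unreadable file, skip it
            if count:
                hits[os.path.relpath(path, root)] = count
    return hits
```

Calling count_hits on the folder where the ZIP was extracted gives a short list of files worth opening first, instead of clicking through every folder.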

 

Seeing that the HW in there now is from ~2014, it is getting old, but it's not like it's ancient. I'm thinking that IF there is a HW error, it should be in either the MB, CPU or RAM? But have you heard of other HW components causing these kinds (or other kinds) of problems? (PSU, LSI card, GPU or other?)


CPU - i7-4790K

MB - ASUS Z97-DELUXE 

RAM - HyperX Fury 4 x 8GB DDR3

HBA - LSI SAS2308

 

 

As you can see, the HW in there now is not amazing, but it is good enough for my use case of NAS/file server/NextCloud/etc. It was actually my daily-driver PC up until January of last year, when I bought new PC HW and this system got "demoted" to Unraid duty...

 

Dang this was annoying AF... :( 

 

 

 

//Magmanthe


Well, I only have the syslog from the diagnostics ZIP and the one that's in appdata, but that syslog tells me nothing. It just says the server is up!

May  7 11:31:23 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock
May  7 12:19:16 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock
May  7 12:19:22 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock
May  9 09:26:52 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock
May  9 09:26:57 magnas /dev/log [info]: Server is up! /var/run/unraid-api.sock

 

unless there is a third syslog somewhere?
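
To double-check that there really is nothing else in there, the repeated "Server is up!" lines can be filtered out so only the remaining lines are left. A minimal sketch, assuming a plain-text syslog in the format shown above; the function name drop_noise and the noise list are made up for illustration.

```python
def drop_noise(lines, noise=("Server is up!",)):
    """Return non-empty lines that contain none of the noise strings."""
    kept = []
    for line in lines:
        text = line.rstrip("\n")
        if not text.strip():
            continue  # skip blank lines
        if any(marker in text for marker in noise):
            continue  # skip the repeated status messages
        kept.append(text)
    return kept
```

Passing it an open syslog file works, since a text file iterates line by line. An empty result would confirm that this particular log only holds the status messages.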

 

 

Anywho... I restarted the server this morning, and I'm pretty sure that with all the rebuilding, the lock-ups and the hard resets (power switch), it's generally not having a good time.

 

Is there ANY way to get some kind of wireframe/folder-structure fished out? (from the parity-disk or something?)

 I know the data might be gone, but if I could get some kind of overview of the root-folder structure, that would help a lot.

 

I do still have the old MB/CPU/RAM from my first Unraid box lying around somewhere. I'll see if I can swap some parts around, but without knowing what is failing it might not help.

Also, I don't think there is an M.2 slot on that old board, so I cannot move the cache over to test.

 

 

//Magmanthe

unraid_syslog1.png


Well would you look at that...

So that would account for why the disk(s) sometimes "fall out" of the array, if I'm understanding that correctly...

 

But would that also account for the lock-up of the system, making the WebGUI totally unresponsive and the server unreachable and unpingable on the network?
I mean, the Unraid OS runs off a USB stick, and that is directly connected to the I/O on the motherboard. The HBA only deals with the disks, correct?

I mean, if I can replace the HBA, that is by far the cheaper option compared to replacing the MB/CPU/RAM setup...

 

UPDATE:
So when I bought the HBA I actually got two cards: 1 for the server and 1 that I use in my PC.

In the server there is a LSI SAS 9207-8i 

and in the PC there is a Dell Perc H310.

 

I can try to just swap over the DELL PERC card and see if the server stabilizes.

 

Also is there any way to find out if the whole HBA-card is fubar, or if maybe it's localized to 1 of the 2 "ports" or connection on the card?

 

//Magmanthe

 

  • 4 weeks later...

Hey..

 

So long time no hear, but there’s been a bunch of stuff going on.

 

First I tried my other HBA to see if it would work, but it had some kind of major incompatibility (I think with the mobo): no matter which PCIe slot I stuck it into, nothing happened. I got no life in it, nothing in the BIOS, nothing… I put it back in my computer, and it works perfectly again.

 

So I thought, let’s just skip the HBA and do SATA, but this also caused some issues. It seems that some of the SATA-ports on the MOBO were dead/not working and that meant I could not attach all the drives.

 

So I thought fuck it... The HW is around 6-7 years old and it was around the time of my birthday, so I'd gift myself some new parts. I got a new CPU, motherboard and RAM, and I also ordered a new HBA.

- CPU - Ryzen 5 3600

- MB - MSI X470 Gaming Plus Max

- RAM - Crucial Ballistix 32GB

- HBA - LSI 9211-8i

 

After I got all the parts I put it together and started the Server. The HBA is detected, and all drives are also detected with the HBA-card, so that’s good.

 

However, after all the back and forth with the previous HW and multiple attempts at rebuilding parity (which probably failed due to the old faulty HBA card), the data on Disk 1 is gone.

Once the server started with the new HW, it finished a parity rebuild, but Disk 1 was now empty. Kind of sucks, but such is life...

 

But now there is 1 new problem, which has to do specifically with the M.2 drive I use as the cache drive.

But I’m making a new post for that, as this one can be closed.

 

Link to new post  

 

 

 

I would like to thank trurl and JorgeB for the initial help and support with the previous HW issue regarding the HBA card. Thanks a bunch.

 

 

//magmanthe

  • Magmanthe changed the title to SOLVED - 2 issues | Unresponsive server + disks falling out during reboot
