casperse Posted April 20, 2023 (edited)

Hi all,

GHOST IN THE MACHINE? I really need some help fixing my Unraid server. I am seeing many new errors and I am not sure what is causing them. So far Unraid has been pretty stable. Most of this started during the rebuild of a new replacement drive, no. 18.

I got this error:

And the log from disk 1:

I guess it is best to wait for drive 18 to finish rebuilding before I try a reboot and a rebuild of drive 1, since it looks like I am emulating two drives: disk 1 and disk 18. Drive 1 now shows up under unassigned drives?

On top of this I am getting some other strange errors and behavior, including a run-out-of-memory message:

Diagnostics attached: diagnostics-20230420-1529.zip

New diagnostics (disk 1 was removed?): diagnostics-20230421-1305.zip

Edited April 21, 2023 by casperse
casperse Posted April 21, 2023 Author
Before disk 1 was removed it was named: And now it says (sde). Any input on how I can get my disk back into the array?
JorgeB Posted April 21, 2023
Missed this post before. Disk 1 dropped offline. SMART looks good, so check/replace the cables, and if the emulated disk is mounting and its contents look correct you can rebuild on top.
casperse Posted April 21, 2023 Author
I stopped the array, removed the disabled drive 1, and started the array without VMs and Docker. I then added drive 1 again and restarted the array. It began rebuilding drive 18 while emulating drive 1, BUT then it stopped, and when I looked in the log file I got this:

Now it is reporting errors on drives 2 and 6. I have shut down the server and I don't know how to proceed. New diagnostics attached here: diagnostics-20230421-1352.zip
casperse Posted April 21, 2023 Author
The cables look fine, the temperatures are fine, and the controller (LSI Logic SAS 9305-24i Host Bus Adapter) has a connection to all drives.
JorgeB Posted April 21, 2023 Solution
Disks 2 and 6 also dropped. This is usually a power/connection problem; it could also be a bad PSU.
casperse Posted April 21, 2023 Author
@JorgeB That could explain it! I don't believe it is the cables or the controller, so the PSU actually makes sense! I don't dare turn the server on before I have a replacement PSU (I will report back when I have been out shopping for one).
casperse Posted April 21, 2023 Author
Okay @JorgeB, I got a brand new 1000W Corsair PSU and I just booted the server. Now I get a new error message:

I am pretty sure I have a backup on my Unraid account? BUT I can see that drive 18 has started to rebuild, and the logs don't show any errors! So far so good!

Getting new errors again, but it is still rebuilding (ETA 4 days!)

So what should I do now? Wait for the two drives to rebuild? Do I need a new USB for Unraid?
JorgeB Posted April 21, 2023
Did the flash warning appear only at array start, or is it still there? If it is still there, post new diags. Also see if this helps with the PCIe errors: https://forums.unraid.net/topic/118286-nvme-drives-throwing-errors-filling-logs-instantly-how-to-resolve/?do=findComment&comment=1165009
casperse Posted April 21, 2023 Author
It only appeared at start up and I have not seen it since. I can see that it did a backup of the USB to "My server", so that also worked! That's a good thing, right? Like this:

So the error is not related to drive failures but to NVMe drive errors (because of the SMART transfer warnings).

Again, thanks for helping me out! I would never have guessed that my Platinum Corsair AX860i power supply would cause problems. I actually think they have a very long warranty; I have to check that.
JorgeB Posted April 21, 2023
Just now, casperse said: It only appeared at start up
You can ignore those.
casperse Posted April 21, 2023 Author
So I added pcie_aspm=off, rebooted, and started the rebuild. There were no errors at first, but after some time I started getting these again?
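One way to confirm that the boot option actually took effect is to look for it on the running kernel's command line. The sketch below is illustrative and not from the thread: the helper names `kernel_param` and `aspm_disabled` are made up, and it assumes a Linux system that exposes its boot parameters in /proc/cmdline. The check only accepts the exact value `off`, so a misspelled value such as `pcie_aspm=of` is reported as not disabled.

```python
from pathlib import Path
from typing import Optional


def kernel_param(cmdline: str, name: str) -> Optional[str]:
    """Return the value of `name=value` on a kernel command line, or None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Only accept tokens of the form name=value; bare flags are skipped.
        if sep and key == name:
            return value
    return None


def aspm_disabled(cmdline: str) -> bool:
    """True only if the command line carries exactly pcie_aspm=off."""
    return kernel_param(cmdline, "pcie_aspm") == "off"


if __name__ == "__main__":
    # /proc/cmdline holds the parameters the running kernel was booted with.
    cmdline = Path("/proc/cmdline").read_text()
    print("pcie_aspm=off active:", aspm_disabled(cmdline))
```

If this reports that the option is not active after a reboot, the boot configuration is worth rechecking before drawing conclusions about whether the setting helps.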
JorgeB Posted April 22, 2023
You can try a different PCIe slot for the HBA: if it's in a CPU slot try a PCH one, or vice versa.
casperse Posted April 22, 2023 Author
4 hours ago, JorgeB said: You can try a different PCIe slot for the HBA, if it's in a CPU slot try a PCH one, or vice versa.
Thanks, I will try that after the rebuild is done. I think the HBA is in one of the x8 slots. Unfortunately it looks like it is going to take a very long time (4-5 days) for the 18TB + 12TB drives to rebuild. Are there any tweaks I can use to make this faster? (I am thinking of the disk settings, since most of my drives are larger and faster.) So far I think they are at very standard values (I have tried to search the forum but haven't found any newer posts on this subject).
JorgeB Posted April 22, 2023
Since v6.8, tuning doesn't usually change much. What is the current speed? Also post the output of:
lspci -d 1000: -vv
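One thing that can be read from that output is whether the HBA's PCIe link negotiated the speed and width it is capable of. This is an assumption about what is useful to look at here, not something stated in the thread. The sketch below compares the `LnkCap:` and `LnkSta:` lines that `lspci -vv` prints for a PCIe device; the function names and the sample text in the usage note are made up for illustration, and it assumes the text covers a single device.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "Speed 8GT/s, Width x8" or "Speed 5GT/s (downgraded), Width x4".
LINK_RE = re.compile(r"Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")


def parse_link(line: str) -> Optional[Tuple[float, int]]:
    """Extract (speed in GT/s, lane count) from a LnkCap/LnkSta line."""
    match = LINK_RE.search(line)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


def link_report(lspci_text: str) -> Optional[dict]:
    """Compare link capability with link status for one device's output."""
    capable = negotiated = None
    for raw in lspci_text.splitlines():
        line = raw.strip()
        # "LnkCap:" / "LnkSta:" with the colon, so LnkCap2/LnkSta2 are skipped.
        if line.startswith("LnkCap:"):
            capable = parse_link(line)
        elif line.startswith("LnkSta:"):
            negotiated = parse_link(line)
    if capable is None or negotiated is None:
        return None
    downgraded = negotiated[0] < capable[0] or negotiated[1] < capable[1]
    return {"capable": capable, "negotiated": negotiated, "downgraded": downgraded}
```

For example, feeding it an invented pair of lines such as `LnkCap: Port #0, Speed 8GT/s, Width x8` and `LnkSta: Speed 5GT/s (downgraded), Width x4 (downgraded)` yields a report with `downgraded` set to `True`.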
casperse Posted April 22, 2023 Author
Hi JorgeB,
The speed is at most 120 MB/sec, and now pretty low, 8.6 MB/sec, most of the time:

lspci -d 1000: -vv

Quote: You can try a different PCIe slot for the HBA, if it's in a CPU slot try a PCH one, or vice versa.

My current PCIe slots and hardware placement:
PCIe slot 1: x8 NVIDIA Quadro P2000
PCIe slot 2: x4 NVIDIA GeForce RTX 3060
PCIe slot 3: x8 LSI Logic SAS 9305-24i Host Bus Adapter
PCIe slot 4: x4 M.2. NVMe ICY BOX: IB-PCI215M2-HSL adapter

New placement?
PCIe slot 1: x8 LSI Logic SAS 9305-24i Host Bus Adapter
PCIe slot 2: x4 NVIDIA Quadro P2000
PCIe slot 3: x8 NVIDIA GeForce RTX 3060
PCIe slot 4: x4 M.2. NVMe ICY BOX: IB-PCI215M2-HSL adapter
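To put the two reported speeds in perspective, here is a rough back-of-the-envelope calculation of how long an 18TB rebuild would take at a constant rate. It is a sketch under stated assumptions: decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), the whole disk being processed end to end, and a steady speed, which a real rebuild does not have. The function names are made up for illustration.

```python
def rebuild_hours(size_tb: float, speed_mb_s: float) -> float:
    """Hours needed to process size_tb terabytes at a constant speed_mb_s."""
    if speed_mb_s <= 0:
        raise ValueError("speed must be positive")
    seconds = (size_tb * 1e12) / (speed_mb_s * 1e6)
    return seconds / 3600


def average_speed_mb_s(size_tb: float, days: float) -> float:
    """Average MB/s implied by finishing size_tb terabytes in `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    return (size_tb * 1e12) / (days * 86400) / 1e6


if __name__ == "__main__":
    for speed in (120, 8.6):
        hours = rebuild_hours(18, speed)
        print(f"18 TB at {speed} MB/s: {hours:.1f} h ({hours / 24:.1f} days)")
    print(f"18 TB in 4.5 days implies {average_speed_mb_s(18, 4.5):.1f} MB/s on average")
```

Under those assumptions, 18TB takes roughly 42 hours at 120 MB/sec but about 24 days at 8.6 MB/sec, and a 4-5 day estimate corresponds to an average somewhere around 40-50 MB/sec, so the rebuild cannot have been running at the low speed for most of its duration.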
JorgeB Posted April 23, 2023
19 hours ago, casperse said: Speed is at highest 120MB/sec and now pretty low 8.6 MB/sec most of the time:
Post new diags when the speed is low like that.
casperse Posted April 25, 2023 Author
Hi JorgeB,
It just finished; it must have sped up at the end... and it says that the array is OK for both drives now? While it was running it only stated that it was rebuilding the 18TB drive and not drive 1 (12TB), but according to Unraid my array is fine now?

Anyway, I am now looking into building another Unraid server as a backup server (this was kind of a wake-up call).

Again, thanks for all your help. I am so happy to be up and running again with parity protection on all drives!
JorgeB Posted April 25, 2023
2 hours ago, casperse said: When it was running it only stated it was rebuilding the 18TB drive and not Drive 1 (12TB) but according to Unraid my array is fine now?
If every disk has a green icon they are all enabled.