trurl Posted January 29
I see now after reviewing thread. You can assign it back, but don't format if it gives you that option. And you will have to rebuild parity.
aurevo Posted January 29 Author
9 minutes ago, trurl said: "I see now after reviewing thread. You can assign it back, but don't format if it gives you that option. And you will have to rebuild parity."
Since I have to rebuild parity anyway, would it be possible to go directly to dual parity with two new 8TB hard disks, or does something stand in the way of this option?
JorgeB Posted January 29
47 minutes ago, aurevo said: "I changed the SATA cable and connected the HDD to the onboard controller instead of the other one. Does this look better in the logs, or are there still errors? Should these errors appear in the system log or in another one?"
Looks fine now, no more ATA errors and SMART looks OK.
38 minutes ago, aurevo said: "But some posts ago I tried to mount the 'old previous disk' via Unassigned Devices and that worked. So I think I can assign it back and will have access to the data, correct?"
If the old disk is mounting and SMART looks OK you can resync parity (including parity2 at the same time if you want), then copy the data back.
aurevo Posted January 29 Author
9 minutes ago, JorgeB said: "Looks fine now, no more ATA errors and SMART looks OK. If the old disk is mounting and SMART looks OK you can resync parity (including parity2 at the same time if you want), then copy the data back."
So, to check and double-check: my plan would be to assign all drives as before, swapping Disk 3 (the empty new one) for the old one with data on it, and changing from single parity to dual parity. Would this be an option?
trurl Posted January 29
Correct. And
49 minutes ago, trurl said: "don't format if it gives you that option."
aurevo Posted February 4 Author
On 1/29/2024 at 3:50 PM, trurl said: "Correct. And"
So I actually changed all SATA data cables to new ones and started a parity rebuild to dual parity, following your tips. The rebuild was successful, but last night the system became unreachable and I had to do an unclean restart. I now see several errors in the system log. Can you check whether this could be a connection problem or whether there are possible hardware failures? For now I don't know what I should do next: maybe change an HDD, change the SATA adapter, or something else. And I don't know why the system froze or was unresponsive this night/morning.
backup-diagnostics-20240204-1443.zip syslog
aurevo Posted February 4 Author
Errors like
Feb 4 15:00:14 Backup kernel: ata12.02:
mean it is this device under System Devices, correct?
[12:0:0:0] disk ATA HGST HUS726060AL WD05 /dev/sdi 6.00TB
So only one HDD is throwing these error messages at the moment, right?
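For anyone wanting to check which `/dev/sdX` sits behind a given `ataN` port, one possible approach is to resolve each `/sys/block/sdX` link and look for an `ataN` component in the path. The sketch below is only an illustration: it assumes a Linux system where the disk hangs off a libata-managed controller (disks behind a SAS HBA typically have no `ataN` component), and the helper names are hypothetical. It does not confirm or refute the mapping asked about above.

```python
import os
import re


def ata_port_from_sysfs_path(path):
    """Return the 'ataN' component of a resolved sysfs path, or None."""
    match = re.search(r"/(ata\d+)/", path)
    return match.group(1) if match else None


def map_disks_to_ata_ports(sys_block="/sys/block"):
    """Map sdX names to their ataN port (None if the path has no ataN part)."""
    mapping = {}
    if not os.path.isdir(sys_block):
        # Not a Linux system (or sysfs not mounted): nothing to report.
        return mapping
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("sd"):
            continue
        resolved = os.path.realpath(os.path.join(sys_block, name))
        mapping[name] = ata_port_from_sysfs_path(resolved)
    return mapping


if __name__ == "__main__":
    for disk, port in map_disks_to_ata_ports().items():
        print(disk, port or "no ataN component in sysfs path")
```

Run on the server itself, this prints one line per `sdX` device, which can then be compared with the `ataN` prefix in the kernel messages.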
trurl Posted February 5
On 1/29/2024 at 7:34 AM, JorgeB said: "Still having ATA errors, note that using a Marvell controller and a controller with SATA port multipliers is not recommended, especially both together."
Looks like this controller is causing problems for disks 1 and 2.
aurevo Posted February 12 Author
On 2/5/2024 at 2:02 AM, trurl said: "Looks like this controller is causing problems for disks 1 and 2."
I changed the adapter to a crossflashed D2607-A21. After a few hours one of the parity disks showed a red X (faulty/missing), so I replaced it today with a new HDD. I started a parity rebuild, but a few moments ago Disk 1 got the same red X, so I shut the server down to work out what to do next. It's so annoying.
backup-diagnostics-20240212-1704.zip
JorgeB Posted February 12
Though in the syslog it still looks more like a power/connection issue, disk1 may be failing; run an extended SMART test on that disk.
aurevo Posted February 13 Author
On 2/12/2024 at 6:02 PM, JorgeB said: "Though in the syslog it still looks more like a power/connection issue, disk1 may be failing; run an extended SMART test on that disk."
Changed the power cable to another string from the PSU and moved the SATA cable from the D2607 to the onboard controller. Some time after starting the extended SMART test, the system hung and became unavailable. Does it look like a defective HDD, or still cable or power? And what could be the reason for the whole system to hang?
backup-diagnostics-20240213-1813.zip syslog syslog-previous
JorgeB Posted February 13
55 minutes ago, aurevo said: "Some time after starting the extended SMART test, the system hung and became unavailable."
Server should not hang because of a disk, especially during a SMART test since that's done by the disk itself, try again.
trurl Posted February 14
20 hours ago, aurevo said: "the system hung"
Set up a syslog server.
aurevo Posted February 14 Author
20 hours ago, JorgeB said: "Server should not hang because of a disk, especially during a SMART test since that's done by the disk itself, try again."
A few seconds after starting the SMART check I got: Interrupted (host reset)
syslog-10.10.10.21.log
JorgeB Posted February 14
This happens if something else accesses the disk, try again.
aurevo Posted February 14 Author
9 minutes ago, JorgeB said: "This happens if something else accesses the disk, try again."
Tried it several times. Same error each time.
Feb 14 17:14:18 Backup kernel: ata1: link is slow to respond, please be patient (ready=0)
Feb 14 17:14:22 Backup kernel: ata1: softreset failed (device not ready)
Feb 14 17:14:23 Backup kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb 14 17:14:23 Backup kernel: ata1.00: configured for UDMA/133
Also, after a reboot, the array was stopped and the disk was not in its slot.
JorgeB Posted February 14
2 minutes ago, aurevo said:
Feb 14 17:14:18 Backup kernel: ata1: link is slow to respond, please be patient (ready=0)
Feb 14 17:14:22 Backup kernel: ata1: softreset failed (device not ready)
Feb 14 17:14:23 Backup kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb 14 17:14:23 Backup kernel: ata1.00: configured for UDMA/133
These can also interrupt the test, replace/swap cables and try again.
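When swapping cables one at a time, it can help to count how often each ATA port logs link trouble before and after a change. The following is a minimal sketch, not an official tool: the function name is hypothetical, and the list of "trouble" phrases contains only the two messages quoted in this thread, so it would need extending for other error types.

```python
import re
from collections import Counter

# Matches e.g. "kernel: ata1: ..." and "kernel: ata1.00: ..."
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)$")

# Assumption: only the two link-trouble phrases seen in this thread.
TROUBLE_MARKERS = ("link is slow to respond", "softreset failed")

SAMPLE_LINES = [
    "Feb 14 17:14:18 Backup kernel: ata1: link is slow to respond, please be patient (ready=0)",
    "Feb 14 17:14:22 Backup kernel: ata1: softreset failed (device not ready)",
    "Feb 14 17:14:23 Backup kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
    "Feb 14 17:14:23 Backup kernel: ata1.00: configured for UDMA/133",
]


def count_ata_trouble(lines):
    """Count link-trouble messages per ATA port (ata1, ata2, ...)."""
    counts = Counter()
    for line in lines:
        match = ATA_LINE.search(line)
        if match and any(marker in match.group(2) for marker in TROUBLE_MARKERS):
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for port, hits in sorted(count_ata_trouble(SAMPLE_LINES).items()):
        print(port, hits)
```

To use it on a real log, pass an open file instead of `SAMPLE_LINES`, for example `count_ata_trouble(open("syslog-10.10.10.21.log"))`.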
aurevo Posted February 14 Author
3 hours ago, JorgeB said: "These can also interrupt the test, replace/swap cables and try again."
Changed the SATA cable to one from the crossflashed adapter and used another power cable too. The SMART check was at 20% when the system became partially unavailable again. The web interface is unavailable and ping still works, but no connection is possible via SSH; it gets stuck at the prompt.
syslog-10.10.10.21.log
JorgeB Posted February 15
If the server is hanging/crashing you will need to try and fix that first, extremely unlikely that a SMART test is crashing the server.
aurevo Posted February 15 Author
3 hours ago, JorgeB said: "If the server is hanging/crashing you will need to try and fix that first, extremely unlikely that a SMART test is crashing the server."
Are there any hints in the logs as to what could be causing the crash or the system hang? The server ran for months without any problems or dropouts; I just installed the same components in a new case and replaced the hard disks. In the course of this I only updated UnRAID to the latest version, and at least I had no problems with that on my other system.
JorgeB Posted February 15
Nothing obvious that I can see, a couple of smartctl segfaults and some ATA errors, if you leave the server idle without doing anything does it still crash?
aurevo Posted February 18 Author
On 2/15/2024 at 6:22 PM, trurl said: "Have you done memtest?"
Yes, the attached image is from the third run. I let the automatic run go twice, then selected every test that could be selected and let it run. The image is from yesterday. It has kept running until now and still shows no errors.
aurevo Posted February 18 Author
On 2/15/2024 at 1:08 PM, JorgeB said: "Nothing obvious that I can see, a couple of smartctl segfaults and some ATA errors, if you leave the server idle without doing anything does it still crash?"
Yes, I restarted the server more than once after it hung, and after a restart it freezes again without my doing anything else. Interestingly, during the memtest everything was okay. But maybe that was coincidence.
syslog-10.10.10.21.log
trurl Posted February 19
Feb 17 22:05:20 Backup kernel: I/O error, dev sdj, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
Feb 17 22:05:20 Backup kernel: sd 7:0:6:0: Power-on or device reset occurred
Some of this on disk2. Post new diagnostics.
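To see at a glance which devices report I/O errors in a long syslog, the kernel lines can be reduced to device, sector, and operation. This is a small sketch under the assumption that the lines follow the format quoted above; the function name is hypothetical.

```python
import re

# Matches e.g. "I/O error, dev sdj, sector 0 op 0x0:(READ) ..."
IO_ERROR = re.compile(
    r"I/O error, dev (\w+), sector (\d+) op 0x[0-9a-fA-F]+:\((\w+)\)"
)


def parse_io_error(line):
    """Return device, sector and operation for an I/O error line, else None."""
    match = IO_ERROR.search(line)
    if match is None:
        return None
    return {
        "device": match.group(1),
        "sector": int(match.group(2)),
        "op": match.group(3),
    }


if __name__ == "__main__":
    line = (
        "Feb 17 22:05:20 Backup kernel: I/O error, dev sdj, sector 0 "
        "op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2"
    )
    print(parse_io_error(line))
```

Lines that are not I/O errors, such as the "Power-on or device reset occurred" message, simply return `None`.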