aurevo

Members
  • Posts

    179

Posts posted by aurevo

  1. 1 hour ago, trurl said:

    Looks like that one disconnected and then reconnected as a different device, so it is now an Unassigned Device.

     

    Don't see the Toshiba that was assigned as disk2.

     

    How are you getting power to these disks? Any splitters? Too many on one PSU cable?

     

     

    Is there any way to get it back into the system as the "old drive"?

    If the Toshiba hard disk is not defective and disk 1 is recognized again, parity would be valid and I would only have to rebuild the second parity, instead of rebuilding one parity disk and one data disk, which is the same situation as before.

     

    I powered off the system, changed the power cable, and unplugged and replugged the SATA data cable; now the HDD is back again.

     

    I will watch whether any errors appear again.

     

    At the moment I use three power cables from one PSU to the HDDs, some with splitters, some without.

    backup-diagnostics-20240220-1918.zip

  2. On 2/19/2024 at 5:14 PM, trurl said:
    Feb 17 22:05:20 Backup kernel: I/O error, dev sdj, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
    Feb 17 22:05:20 Backup kernel: sd 7:0:6:0: Power-on or device reset occurred
    

    Some of this on disk2.

     

    Post new diagnostics.

     

    I ran a Lubuntu live system from a USB stick for roughly the last 36 hours without any crash.

     

    After restarting, the above-mentioned disk is missing.

     

    Disk 1 is also missing, but should be HGST_HUS726060ALE614_W2503880 (maybe it got a different device letter after I switched cables).

    backup-diagnostics-20240220-1727.zip
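
Kernel messages like the two quoted at the top of this post can be pulled out of a syslog with a few lines of Python. This is only a minimal sketch: the regular expression is modelled on the single `I/O error` line shown above, and the function name `count_io_errors` is made up for illustration. Other kernel versions may word the message differently.

```python
import re
from collections import Counter

# Pattern modelled on the one I/O error line quoted in this post (an assumption,
# not an exhaustive description of kernel log formats).
IO_ERROR = re.compile(r"kernel: I/O error, dev (\w+), sector (\d+)")

def count_io_errors(lines):
    """Count I/O error lines per device name (e.g. 'sdj')."""
    counts = Counter()
    for line in lines:
        match = IO_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# The two lines quoted above, used as sample input.
sample = [
    "Feb 17 22:05:20 Backup kernel: I/O error, dev sdj, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2",
    "Feb 17 22:05:20 Backup kernel: sd 7:0:6:0: Power-on or device reset occurred",
]
io_errors = count_io_errors(sample)
print(dict(io_errors))
```

Run against a full syslog file, a count concentrated on one device would point at that disk or its cabling rather than at the whole controller.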

  3. On 2/15/2024 at 1:08 PM, JorgeB said:

    Nothing obvious that I can see, a couple of smartctl segfaults and some ATA errors, if you leave the server idle without doing anything does it still crash?

     

    Yes. I restarted the server more than once after it hung, and after each restart it froze again without my doing anything else.

     

    Interestingly, everything was okay while the memtest was running. But maybe that was a coincidence.

    syslog-10.10.10.21.log

  4. 3 hours ago, JorgeB said:

    If the server is hanging/crashing you will need to try and fix that first, extremely unlikely that a SMART test is crashing the server.

     

    Are there any hints in the logs as to what could be causing the crash or the system hang?

     

    The server ran for months without any problems or dropouts; all I did was install the same components in a new case and replace the hard disks.

     

    The only other change was updating UnRAID to the latest version, but at least that caused no problems on my other system.

  5. 9 minutes ago, JorgeB said:

    This happens if something else accesses the disk, try again.

     

    I tried it several times. Same error each time.

     

    Feb 14 17:14:18 Backup kernel: ata1: link is slow to respond, please be patient (ready=0)
    Feb 14 17:14:22 Backup kernel: ata1: softreset failed (device not ready)
    Feb 14 17:14:23 Backup kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
    Feb 14 17:14:23 Backup kernel: ata1.00: configured for UDMA/133

     

    I also tried after a reboot, with the array stopped and the disk not assigned to a slot.

  6. On 2/12/2024 at 6:02 PM, JorgeB said:

    Though in the syslog it still looks more like a power/connection issue, disk1 may be failing; run an extended SMART test on that disk.

     

     

     

    I moved the power cable to another string from the PSU and moved the SATA cable from the D2607 to the onboard controller.

     

    Some time after I started the extended SMART test, the system hung and became unreachable.

     

    Does it look like a defective HDD, or still a cable or power issue? And what could be the reason for the whole system hanging?

    backup-diagnostics-20240213-1813.zip syslog syslog-previous

  7. On 2/5/2024 at 2:02 AM, trurl said:

    Looks like this controller is causing problems for disks 1 and 2.

     

     

    I changed the adapter to a crossflashed D2607-A21.

    After a few hours one of the parity disks showed a red X (defective/missing), so I replaced it with a new HDD today.

     

    I started the parity rebuild, but a few moments ago Disk 1 showed the same red X, so I shut down the server to work out what to do next. It's so annoying.

    backup-diagnostics-20240212-1704.zip

  8. On 1/29/2024 at 3:50 PM, trurl said:

    Correct. And

     

     

    So I replaced all SATA data cables with new ones and started a parity rebuild to dual parity, following your tips.

     

    The rebuild was successful, but last night the system became unreachable and I had to do an unclean restart.

     

    I now see several errors in the system log. Can you check whether this could be a connection problem or whether there are possible hardware failures?

     

    For now I don't know what I should do next. Maybe replace an HDD, change the SATA adapter, or something else.

    And I don't know why the system froze or became unresponsive last night/this morning.

    backup-diagnostics-20240204-1443.zip syslog

  9. 9 minutes ago, JorgeB said:

    Looks fine now, no more ATA errors and SMART looks OK.

     

     

    If the old disk is mounting and SMART looks OK you can resync parity (including parity2 at the same time if you want), then copy the data back.

     

    So, to double-check:


    My plan would be to assign all drives as before, swap Disk 3 (the empty new one) for the old one with the data on it, and change from single parity to dual parity.

     

    Would this be an option?

  10. 9 minutes ago, trurl said:

    I see now after reviewing thread.

     

    You can assign it back, but don't format if it gives you that option.

     

    And you will have to rebuild parity.

     

    Since I have to rebuild parity in any case, would it be possible to go directly to dual parity with two new 8TB hard disks, or does anything stand in the way of this option?

  11. 24 minutes ago, trurl said:

     

    I have a screenshot of the old configuration, so I know the old disk assignment.

     

    The only thing I changed was replacing Disk 3 with a new HDD, because I thought it was defective. So I invalidated parity.

     

    What would be the correct procedure to get the system up and running again?

    Assign the hard disks as before and then start the array?

  12. 29 minutes ago, JorgeB said:

    Just one at the moment:

     

    Device Model:     ST4000DM004
    Serial Number:    WFN1DE6W

     

    It may also be a disk problem since it's not even giving a valid SMART report, try swapping cables with a different disk, ideally one using a different controller to rule that out.

     

    I changed the SATA cable and connected the HDD to the onboard controller instead of the other one.

     

    Does this look better in the logs, or are there still errors? Should these errors appear in the system log or in another one?

    backup-diagnostics-20240129-1443.zip

  13. 8 minutes ago, JorgeB said:

    Still having ATA errors, note that using a Marvell controller and a controller with SATA port multipliers is not recommended, especially both together.

     

    Only on one specific disk or on more than one?

     

    I did not have any other problems with this controller combination. I am also running such a controller in my main UnRAID build without problems.

     

     

    I currently think the problem lies elsewhere. But that is only a guess.

  14. On 12/22/2023 at 8:07 PM, trurl said:

    Your array configuration is on the flash drive in the config folder, just like the rest of the configuration. So that suggests a flash drive problem.

     

    And if that part of your configuration is missing, I have to wonder what else might be. All settings from the webUI are in config on flash.

     

    Post new diagnostics.

     

    Hi, 

     

    I am sorry for writing back so late.

     

    I ordered a new case and assembled everything in it today.

     

    The disk configuration is still lost.

     

    I checked the USB device; it looks good and is readable and writable from both Windows and UnRAID itself.

     

    New diagnostics attached. I hope to recover all my data or get the system running.

    backup-diagnostics-20240129-1243.zip

  15. 44 minutes ago, trurl said:

    Since you say this is a backup server, I assume you have the files somewhere else and can just back them up again.

     

    Connection problems with parity disk. Cancel rebuild, shutdown, check all connections, both ends, SATA and power, including splitters. Reboot, restart rebuild, post new diagnostics.

     

    I am able to mount the "defective" disk with the Unassigned Devices plugin without starting the array. I will run an extended SMART test; maybe it is not faulty.

     

    I think I will order new cables and splitters or change the SATA adapter card.

     

    After starting the server a few minutes ago I now have an empty disk configuration.

    Luckily, I took a screenshot of the config.

    Would the right approach be to assign the drives as before and restart the rebuild, or is there anything else I have to take care of?

  16. 54 minutes ago, trurl said:

    It would not have said exactly that. If the disk had an unmountable filesystem, then it would have given you a checkbox to allow you to format, but you should never format a disk that has data on it you wish to keep.

     

    Format is a write operation that creates an empty filesystem on the disk. Format updates parity like all write operations (how could parity be valid otherwise). So after formatting a disk in the array, parity is in sync with the empty filesystem on that disk, and rebuilding can only rebuild an empty filesystem.

     

    Did it have a RED X next to it?

     

    The correct way to deal with an unmountable disk is with check filesystem. And if the disk is also disabled (RED X), it is still emulated by parity. In that case, you would still check filesystem so the emulated filesystem could be repaired before rebuilding.

     

    Why do you think it was faulty? Connection problems are much more common than bad disks. Even if the disk did need replacing, maybe it can still be read well enough to get files from it.

     

    The disk was displayed as missing; after a restart I could select it for the slot again, and then it said I had to format it.

    Long before that, the disk had SMART errors, so I thought it would be a good idea to replace it.

     

    Quote

    When the diagnostics were taken, it looks like it was still rebuilding disk3, and it is empty.

     

    Yes, I formatted the disk and started rebuilding.

    After that I noticed that the rebuild was incredibly slow and that the disk was empty; formatting it was admittedly a mistake on my part.

     

    What would be the most appropriate steps to take to try to access my data?

    In addition, the hard disks are encrypted. Does this make a difference?
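
The point quoted above, that format is a write which updates parity so a rebuild can only return an empty filesystem, can be illustrated with a toy model. This is only a sketch assuming plain XOR single parity over three tiny 4-byte "disks". The helper `xor_blocks` and the byte values are invented for illustration and say nothing about how UnRAID stores data internally.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three toy data disks; the third one holds the data we care about.
disks = [b"AAAA", b"BBBB", b"DATA"]
parity = xor_blocks(disks)

# Disk 3 goes missing: its content can be rebuilt from parity plus the other disks.
rebuilt = xor_blocks([parity, disks[0], disks[1]])
print(rebuilt)

# Formatting disk 3 inside the array is a write: the (emulated) disk now holds
# an "empty filesystem" (all zeros in this toy model) and parity is updated to match.
disks[2] = bytes(4)
parity = xor_blocks(disks)

# A rebuild after the format can only reproduce the empty content.
rebuilt_after_format = xor_blocks([parity, disks[0], disks[1]])
print(rebuilt_after_format)
```

In this model the original bytes survive only on the physical old disk, which is why mounting the old disk outside the array is the route to the data.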