Posts posted by jowy_ham

  1. 24 minutes ago, JorgeB said:

    If this started out of the blue without any changes it sounds more like a hardware issue, but you can try enabling the syslog server and post that after a crash in case there's something visible there.

    Syslog enabled.

    Sep 15 11:52:44 Tower sSMTP[8441]: Creating SSL connection to host
    Sep 15 11:52:45 Tower sSMTP[8441]: SSL connection using TLS_AES_256_GCM_SHA384
    Sep 15 11:52:48 Tower sSMTP[8441]: Sent mail for [email protected] (221 2.0.0 closing connection j5-20020a17090aeb0500b0026b4ca7f62csm1999412pjz.39 - gsmtp) uid=0 username=root outbytes=910
    Sep 15 12:10:59 Tower emhttpd: spinning down /dev/sdg
    Sep 15 12:11:02 Tower emhttpd: read SMART /dev/sdg
    Sep 15 12:41:02 Tower emhttpd: spinning down /dev/sdg
    Sep 15 12:41:05 Tower emhttpd: read SMART /dev/sdg
    Sep 15 13:11:06 Tower emhttpd: spinning down /dev/sdg
    Sep 15 13:11:09 Tower emhttpd: read SMART /dev/sdg
    Sep 15 13:41:10 Tower emhttpd: spinning down /dev/sdg
    Sep 15 13:41:13 Tower emhttpd: read SMART /dev/sdg
    Sep 15 14:11:13 Tower emhttpd: spinning down /dev/sdg
    Sep 15 14:11:15 Tower emhttpd: read SMART /dev/sdg
    Sep 15 14:41:16 Tower emhttpd: spinning down /dev/sdg
    Sep 15 14:41:19 Tower emhttpd: read SMART /dev/sdg
    Sep 15 17:46:05 Tower file.activity: Starting File Activity
    Sep 15 17:46:05 Tower emhttpd: Starting File Activity...
    Sep 15 17:46:05 Tower file.activity: File Activity inotify starting

     

    At 17:46, the system was force-reset.

     

    Nothing much was logged prior to that
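
    When a crash leaves no error in the log, the last timestamp before a long silence at least narrows down when the system stopped logging. Below is a minimal Python sketch of that idea, not an Unraid tool: it scans syslog-style lines and reports consecutive entries that are more than a chosen interval apart. The function names, the one-hour threshold, and the year 2023 (classic syslog timestamps carry no year) are all assumptions made for illustration; the sample lines are taken from the excerpt above.

```python
from datetime import datetime, timedelta


def parse_ts(line, year=2023):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line.

    Classic syslog stamps carry no year, so the caller supplies one
    (2023 is an assumption based on the post date).
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def find_gaps(lines, threshold=timedelta(hours=1), year=2023):
    """Return (line_before, line_after, gap) for every silence >= threshold."""
    gaps = []
    prev = None  # (timestamp, line) of the previous parsable entry
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        try:
            ts = parse_ts(line, year)
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if prev is not None and ts - prev[0] >= threshold:
            gaps.append((prev[1], line, ts - prev[0]))
        prev = (ts, line)
    return gaps


# A few lines copied from the syslog excerpt in this post.
SAMPLE = [
    "Sep 15 14:11:15 Tower emhttpd: read SMART /dev/sdg",
    "Sep 15 14:41:16 Tower emhttpd: spinning down /dev/sdg",
    "Sep 15 14:41:19 Tower emhttpd: read SMART /dev/sdg",
    "Sep 15 17:46:05 Tower file.activity: Starting File Activity",
]

for before, after, gap in find_gaps(SAMPLE):
    print(f"silence of {gap} between:\n  {before}\n  {after}")
```

    On this excerpt the only silence longer than an hour sits between the 14:41:19 SMART read and the 17:46:05 File Activity start, so the hang happened somewhere in that window. It does not say what caused it.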

  2. Recently my UnRAID system has been crashing randomly. 

     

    I can't find any errors in the syslog. The system was still working fine shortly after 11 am, then by 5 pm I was suddenly unable to reach the web interface or SSH in.

     

    The only task it was doing was preclearing 1 x 12TB HDD; nothing intensive was running during this period.

     

    I have the following running (24x7):

    1 x piHole

    1 x Linux for torrenting

    1 x Windows 2019 for DHCP services

     

    These crashes render the server unresponsive:

    - No display output (most of the time the system runs headless). I have tried connecting a monitor when it crashes, but nothing is shown (black screen)

    - Keyboard not responding (Ctrl+Alt+Del), but the Caps Lock light comes on

    - Web interface not accessible

    - SSH not accessible

     

    After hard-resetting the system, I get random disks reporting disk errors (during the automatic parity check when the array restarts), and these errors disappear after another reboot (I stop the array and reboot the system again) without touching any hardware connections.

     

    Sometimes 1 or 2 disks get disabled due to the errors, which triggers a RAID rebuild, which is .......

     

    Attached are my diagnostic logs; I hope the experts can help.

     

    tower-diagnostics-20230915-1749.zip

  3. 41 minutes ago, trurl said:

    Connection problems on disk 4, check connections, both ends, SATA and power, including splitters.

     

    Why do you have 50G docker.img? 20G is often more than enough.

     

    Even more, why 100G libvirt? I don't think anyone has ever needed more than default 1G for that.

    In which log/area did you find that disk 4 is the one giving the issue? I'd like to learn how to troubleshoot this in the future.

     

    For issues 2 & 3, how do I fix those "oversized" images? Do I have to recreate them?

  4. Previously I went months without any parity check sync errors. But lately, after I replaced 1 x HDD (faulty) and 1 x 8087 cable (replaced due to the UDMA CRC error count), sync errors have been popping up. Is that a cause for concern?

     

    For example, I ran parity checks on these dates:

    24-Jan-2022, there were 766 sync errors

    31-Jan-2022, there were 272 sync errors

    01-Feb-2022, there were 128 sync errors

     

    I'm in the middle of my 4th parity check, and there are still 128 sync errors.

     

    Can the experts please advise on what I should do?

     

    Attached is the diagnostics ZIP for your reference.

     

    By the way, if I need to replace a current HDD (8/10TB) with a higher-capacity HDD (14TB), what's the correct procedure?

    Is it to use the new 14TB HDD to replace one of the parity HDDs, then use the replaced parity HDD as a normal data HDD?

     

     

    tower-diagnostics-20220202-2117.zip
