Unraid 6.12.4 Crashing every 2 days


I am running Unraid 6.12.4. It used to be stable for months without a reboot, but now it is crashing about every 2 days. Please let me know what information I can provide to make this easier to troubleshoot. Here is the last syslog from the USB flash drive:

 

Oct 10 23:08:13 Tower rsyslogd: [origin software="rsyslogd" swVersion="8.2102.0" x-pid="745" x-info="https://www.rsyslog.com"] start
Oct 10 23:09:00 Tower root: Fix Common Problems Version 2023.10.08a
Oct 10 23:09:06 Tower root: Fix Common Problems: Warning: Syslog mirrored to flash
Oct 10 23:42:33 Tower kernel: md: recovery thread: P corrected, sector=876084888
Oct 10 23:46:37 Tower kernel: md: recovery thread: P corrected, sector=969524584
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)
Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)
Oct 11 00:23:16 Tower kernel: ACPI Error: Aborting method \_SB.PC00.PEG1.PEGP._DSM due to previous error (AE_ALREADY_EXISTS) (20220331/psparse-529)
Oct 11 00:25:00 Tower kernel: md: recovery thread: P corrected, sector=1839214176
Oct 11 00:48:44 Tower kernel: md: recovery thread: P corrected, sector=2372122256
Oct 11 00:50:29 Tower kernel: md: recovery thread: P corrected, sector=2411118792
Oct 11 02:09:43 Tower kernel: md: recovery thread: P corrected, sector=4160821760
Oct 11 02:14:35 Tower kernel: md: recovery thread: P corrected, sector=4266034312
Oct 11 03:01:46 Tower kernel: md: recovery thread: P corrected, sector=5240352552
Oct 11 03:14:12 Tower kernel: md: recovery thread: P corrected, sector=5501928040
Oct 11 03:57:29 Tower kernel: md: recovery thread: P corrected, sector=6356297344
Oct 11 04:03:04 Tower kernel: md: recovery thread: P corrected, sector=6472447560
Oct 11 04:09:28 Tower kernel: md: recovery thread: P corrected, sector=6605372720
Oct 11 04:09:28 Tower kernel: md: recovery thread: P corrected, sector=6605537848
Oct 11 04:15:03 Tower kernel: md: recovery thread: P corrected, sector=6721718464
Oct 11 05:21:21 Tower kernel: md: recovery thread: P corrected, sector=8079225120
Oct 11 06:17:37 Tower kernel: md: recovery thread: P corrected, sector=9194786056
Oct 11 06:56:43 Tower kernel: md: recovery thread: P corrected, sector=9952555888
Oct 11 07:16:07 Tower kernel: md: recovery thread: P corrected, sector=10322880368
Oct 11 08:56:24 Tower kernel: md: recovery thread: P corrected, sector=12162674192
Oct 11 09:09:42 Tower kernel: md: recovery thread: P corrected, sector=12397344152
Oct 11 09:23:52 Tower kernel: md: recovery thread: P corrected, sector=12616149160
Oct 11 09:34:41 Tower kernel: md: recovery thread: P corrected, sector=12782272384
Oct 11 09:53:13 Tower kernel: md: recovery thread: P corrected, sector=13062719752
Oct 11 10:02:35 Tower kernel: md: recovery thread: P corrected, sector=13205059048
Oct 11 10:22:26 Tower kernel: md: recovery thread: P corrected, sector=13500627656
Oct 11 10:22:39 Tower kernel: md: recovery thread: P corrected, sector=13504039496
Oct 11 10:22:44 Tower kernel: md: recovery thread: P corrected, sector=13505534432
Oct 11 10:22:47 Tower kernel: md: recovery thread: P corrected, sector=13505936760
Oct 11 10:22:50 Tower kernel: md: recovery thread: P corrected, sector=13506821016
Oct 11 10:23:42 Tower kernel: md: recovery thread: P corrected, sector=13519806816
Oct 11 10:24:04 Tower kernel: md: recovery thread: P corrected, sector=13525245992
Oct 11 10:24:12 Tower kernel: md: recovery thread: P corrected, sector=13527411552
Oct 11 10:25:54 Tower kernel: md: recovery thread: P corrected, sector=13553125328
Oct 11 10:29:13 Tower kernel: md: recovery thread: P corrected, sector=13601607144
Oct 11 10:35:44 Tower kernel: md: recovery thread: P corrected, sector=13697549408
Oct 11 10:40:59 Tower kernel: md: recovery thread: P corrected, sector=13774226472
Oct 11 10:43:12 Tower kernel: md: recovery thread: P corrected, sector=13807621448
Oct 11 14:08:15 Tower kernel: md: recovery thread: P corrected, sector=16855416752
Oct 11 15:17:07 Tower kernel: md: recovery thread: P corrected, sector=17801609456
Oct 11 16:54:56 Tower kernel: md: recovery thread: P corrected, sector=19032553968
Oct 11 17:23:15 Tower kernel: md: recovery thread: P corrected, sector=19363653296
Oct 11 17:56:38 Tower kernel: md: recovery thread: P corrected, sector=19927825408
Oct 11 18:22:16 Tower kernel: md: recovery thread: P corrected, sector=20475680424
Oct 11 19:19:37 Tower kernel: md: recovery thread: P corrected, sector=21677416304
Oct 11 19:42:52 Tower kernel: md: recovery thread: P corrected, sector=22141062984
Oct 11 23:17:14 Tower kernel: md: recovery thread: P corrected, sector=25980343040
Oct 11 23:35:56 Tower kernel: md: recovery thread: P corrected, sector=26287418320
Oct 11 23:35:59 Tower kernel: md: recovery thread: P corrected, sector=26288371056
Oct 12 00:36:40 Tower kernel: md: recovery thread: P corrected, sector=27380217536
Oct 12 01:33:15 Tower kernel: md: recovery thread: P corrected, sector=28362361400
Oct 12 01:46:15 Tower kernel: md: recovery thread: P corrected, sector=28584385616
Oct 12 01:49:03 Tower kernel: md: recovery thread: P corrected, sector=28631703136
Oct 12 01:52:55 Tower kernel: md: recovery thread: P corrected, sector=28696582856
Oct 12 02:32:29 Tower kernel: mpt2sas_cm0 fault info from func: mpt3sas_base_make_ioc_ready
Oct 12 02:32:29 Tower kernel: mpt2sas_cm0: fault_state(0x5861)!
Oct 12 02:32:29 Tower kernel: mpt2sas_cm0: sending diag reset !!
Oct 12 02:32:30 Tower kernel: mpt2sas_cm0: diag reset: SUCCESS
Oct 12 02:32:30 Tower kernel: mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
Oct 12 02:32:30 Tower kernel: mpt2sas_cm0: overriding NVDATA EEDPTagMode setting
Oct 12 02:32:30 Tower kernel: mpt2sas_cm0: LSISAS2008: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
Oct 12 02:32:30 Tower kernel: mpt2sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
Oct 12 02:32:30 Tower kernel: mpt2sas_cm0: sending port enable !!
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: port enable: SUCCESS
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: search for end-devices: start
Oct 12 02:32:37 Tower kernel: scsi target9:0:0: handle(0x0009), sas_addr(0x4433221100000000)
Oct 12 02:32:37 Tower kernel: scsi target9:0:0: enclosure logical id(0x5d4ae520ac07d109), slot(3)
Oct 12 02:32:37 Tower kernel: scsi target9:0:1: handle(0x000a), sas_addr(0x4433221101000000)
Oct 12 02:32:37 Tower kernel: scsi target9:0:1: enclosure logical id(0x5d4ae520ac07d109), slot(2)
Oct 12 02:32:37 Tower kernel: scsi target9:0:2: handle(0x000b), sas_addr(0x4433221102000000)
Oct 12 02:32:37 Tower kernel: scsi target9:0:2: enclosure logical id(0x5d4ae520ac07d109), slot(1)
Oct 12 02:32:37 Tower kernel: scsi target9:0:3: handle(0x000c), sas_addr(0x4433221103000000)
Oct 12 02:32:37 Tower kernel: scsi target9:0:3: enclosure logical id(0x5d4ae520ac07d109), slot(0)
Oct 12 02:32:37 Tower kernel: scsi target9:0:4: handle(0x000d), sas_addr(0x4433221104000000)
Oct 12 02:32:37 Tower kernel: scsi target9:0:4: enclosure logical id(0x5d4ae520ac07d109), slot(7)
Oct 12 02:32:37 Tower kernel: scsi target9:0:5: handle(0x000e), sas_addr(0x4433221105000000)
Oct 12 02:32:37 Tower kernel: scsi target9:0:5: enclosure logical id(0x5d4ae520ac07d109), slot(6)
Oct 12 02:32:37 Tower kernel: scsi target9:0:6: handle(0x000f), sas_addr(0x4433221106000000)
Oct 12 02:32:37 Tower kernel: scsi target9:0:6: enclosure logical id(0x5d4ae520ac07d109), slot(5)
Oct 12 02:32:37 Tower kernel: scsi target9:0:7: handle(0x0010), sas_addr(0x4433221107000000)
Oct 12 02:32:37 Tower kernel: scsi target9:0:7: enclosure logical id(0x5d4ae520ac07d109), slot(4)
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: search for end-devices: complete
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: search for end-devices: start
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: search for PCIe end-devices: complete
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: search for expanders: start
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: search for expanders: complete
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: mpt3sas_base_hard_reset_handler: SUCCESS
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: _base_fault_reset_work: hard reset: success
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: removing unresponding devices: start
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: removing unresponding devices: end-devices
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: Removing unresponding devices: pcie end-devices
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: removing unresponding devices: expanders
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: removing unresponding devices: complete
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: scan devices: start
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0:     scan devices: expanders start
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0:     break from expander scan: ioc_status(0x0022), loginfo(0x310f0400)
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0:     scan devices: expanders complete
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0:     scan devices: end devices start
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0:     break from end device scan: ioc_status(0x0022), loginfo(0x310f0400)
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0:     scan devices: end devices complete
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0:     scan devices: pcie end devices start
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: log_info(0x3003011d): originator(IOP), code(0x03), sub_code(0x011d)
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: log_info(0x3003011d): originator(IOP), code(0x03), sub_code(0x011d)
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0:     break from pcie end device scan: ioc_status(0x0022), loginfo(0x3003011d)
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0:     pcie devices: pcie end devices complete
Oct 12 02:32:37 Tower kernel: mpt2sas_cm0: scan devices: complete
Oct 12 02:32:37 Tower kernel: sd 9:0:0:0: Power-on or device reset occurred
Oct 12 02:32:37 Tower kernel: sd 9:0:1:0: Power-on or device reset occurred
Oct 12 02:32:37 Tower kernel: sd 9:0:2:0: Power-on or device reset occurred
Oct 12 02:32:37 Tower kernel: sd 9:0:3:0: Power-on or device reset occurred
Oct 12 02:32:37 Tower kernel: sd 9:0:4:0: Power-on or device reset occurred
Oct 12 02:32:37 Tower kernel: sd 9:0:5:0: Power-on or device reset occurred
Oct 12 02:32:37 Tower kernel: sd 9:0:6:0: Power-on or device reset occurred
Oct 12 02:32:37 Tower kernel: sd 9:0:7:0: Power-on or device reset occurred
Oct 12 05:53:40 Tower kernel: md: recovery thread: P corrected, sector=32376732480
Oct 12 08:50:32 Tower kernel: md: recovery thread: P corrected, sector=34729748016
Oct 12 09:25:43 Tower kernel: md: sync done. time=123965sec
Oct 12 09:25:43 Tower kernel: md: recovery thread: exit status: 0
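
A log this long is easier to scan once it is grouped by message type. The following is a minimal sketch, not an Unraid tool: the regular expressions are copied from message text that appears in the syslog above, while the category names (`parity_corrected`, `acpi_error`, `hba_fault`, `device_reset`) and the `summarize` helper are made up for illustration.

```python
import re
from collections import Counter

# Message fragments taken from the syslog above.
# The category names on the left are hypothetical labels for this sketch.
PATTERNS = {
    "parity_corrected": re.compile(r"md: recovery thread: P corrected, sector=(\d+)"),
    "acpi_error": re.compile(r"ACPI (?:BIOS )?Error"),
    "hba_fault": re.compile(r"mpt2sas_cm0: fault_state\((0x[0-9a-fA-F]+)\)"),
    "device_reset": re.compile(r"Power-on or device reset occurred"),
}


def summarize(lines):
    """Count how many lines fall into each category (first match wins)."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break
    return counts


# A few lines copied from the log above, used as sample input.
SAMPLE = [
    r"Oct 10 23:42:33 Tower kernel: md: recovery thread: P corrected, sector=876084888",
    r"Oct 11 00:23:16 Tower kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PC00.PEG1.PEGP._DSM.USRG], AE_ALREADY_EXISTS (20220331/dsfield-184)",
    r"Oct 11 00:23:16 Tower kernel: ACPI Error: AE_ALREADY_EXISTS, CreateBufferField failure (20220331/dswload2-477)",
    r"Oct 12 02:32:29 Tower kernel: mpt2sas_cm0: fault_state(0x5861)!",
    r"Oct 12 02:32:37 Tower kernel: sd 9:0:0:0: Power-on or device reset occurred",
    r"Oct 12 09:25:43 Tower kernel: md: sync done. time=123965sec",
]

for name, count in summarize(SAMPLE).items():
    print(f"{name}: {count}")
```

To run it against a full log, pass the lines of the saved syslog file to `summarize` instead of `SAMPLE`. Lines that match none of the patterns are simply not counted.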
 

Thank you. I set up Kiwi Syslog Server Manager (free edition) and configured Unraid to log to the remote server. That worked, and I was able to capture the logging data remotely. The results are in the attached file named 'SyslogCatchAll...', and the tower-diagnostics were downloaded right after the latest crash (which didn't even make it 12 hours this time). Let me know if this is sufficient to help isolate the cause of the crashing.

SyslogCatchAll-2023-10-16.txt tower-diagnostics-20231016-1203.zip
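
For anyone without a dedicated syslog product, the remote capture described above can be approximated with a few lines of Python on another machine. This is only a sketch under assumptions: it listens for plain UDP syslog datagrams, the port `5514` and the file name `remote-syslog.txt` are arbitrary choices made here, and the protocol and port must match whatever is entered in Unraid's remote syslog settings. It is not how Kiwi Syslog Server works internally.

```python
import socketserver
from datetime import datetime

LOG_PATH = "remote-syslog.txt"  # hypothetical output file
LISTEN = ("0.0.0.0", 5514)      # assumed port; 514 is the usual default but needs privileges


def format_entry(sender_ip, payload, now=None):
    """Turn one raw syslog datagram into a single timestamped text line."""
    now = now or datetime.now()
    text = payload.decode("utf-8", errors="replace").rstrip("\r\n\x00")
    return f"{now.isoformat(timespec='seconds')} {sender_ip} {text}"


class SyslogHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        payload, _sock = self.request
        entry = format_entry(self.client_address[0], payload)
        # Open and close the file per message so each line reaches disk
        # promptly, which matters when the sender may crash at any moment.
        with open(LOG_PATH, "a", encoding="utf-8") as fh:
            fh.write(entry + "\n")


def serve():
    """Listen until interrupted (Ctrl+C)."""
    with socketserver.UDPServer(LISTEN, SyslogHandler) as server:
        server.serve_forever()
```

Calling `serve()` starts the listener. The last lines written before a crash are then preserved on the receiving machine even if the server never gets to flush its own log.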

Unfortunately, there's nothing relevant logged, which usually points to a hardware issue. One thing you can try is to boot the server in safe mode with all Docker containers and VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.

Thanks for the quick reply @JorgeB. I will give that a shot. The interesting thing is that it was rock-solid for months, then I finally upgraded to 6.12.4 and have only had this trouble since around that time. However, I did make one other change around the same time: I upgraded my motherboard's firmware to the latest version. Are there BIOS-specific settings that could heavily impact the stability of Unraid? I did not make any BIOS changes other than the upgrade. If it helps, the motherboard is an ASUS Z690-P and I went from 2404 to 2802.

PRIME Z690-P|Motherboards|ASUS Global
