Everything posted by scs3jb
-
[Plugin] [Support] Unraid Tab for AI CLI Coding agents (Antigravity CLI, Gemini CLI, Claude Code, OpenCode, Kilo Code, Pi Coder, Codex CLI, Factory Droid CLI, CoPilot, Nano Coder, Qwen Coder, Goose)
Updated to the latest version (I was on a fork), but it seems to have issues with paths:

[2026-04-30 16:18:43] [ERR!] [guard_path] PERSIST_PATH outside allowed prefixes: /mnt/scratch_old/aicliagents
[2026-04-30 16:18:43] [INFO] [MIGRATION] ERROR: Persistence path failed validation: /mnt/scratch_old/aicliagents
[2026-04-30 16:21:37] [INFO] [AICliAgents] Initializing AICliAgents Plugin (v1.0)...
[2026-04-30 16:21:38] [INFO] [StorageMountService] Mounting Home Stack for root
[2026-04-30 16:21:38] [ERR!] [guard_path] PERSIST_PATH outside allowed prefixes: /mnt/scratch_old/aicliagents/persistence
[2026-04-30 16:21:38] [ERR!] [MOUNT] Persistence path failed validation: /mnt/scratch_old/aicliagents/persistence
[2026-04-30 16:21:38] [ERR!] [StorageMountService] Mount script FAILED for home root:
[2026-04-30 16:21:38] [ERR!] [MOUNT] Persistence path failed validation: /mnt/scratch_old/aicliagents/persistence

Is "PERSIST_PATH outside allowed prefixes" a bug? From a fresh install:

[2026-04-30 16:30:33] [ERR!] [guard_path] PERSIST_PATH outside allowed prefixes: /mnt/scratch_old/aicliagents/persistence
[2026-04-30 16:30:33] [ERR!] [MOUNT] Persistence path failed validation: /mnt/scratch_old/aicliagents/persistence
[2026-04-30 16:30:33] [ERR!] [StorageMountService] Mount script FAILED for agent gemini-cli:
[2026-04-30 16:30:33] [ERR!] [MOUNT] Persistence path failed validation: /mnt/scratch_old/aicliagents/persistence
[2026-04-30 16:30:33] [ERR!] [InstallerService] Failed to mount storage stack for gemini-cli
[2026-04-30 16:30:33] [ERR!] [AICliAgents] Background Install Job FAILED for gemini-cli: Could not mount agent storage
[2026-04-30 16:30:34] [INFO] [COMMIT] No changes to commit for home root
[2026-04-30 16:30:34] [ERR!] [guard_path] PERSIST_PATH outside allowed prefixes: /mnt/scratch_old/aicliagents/persistence
[2026-04-30 16:30:34] [ERR!] [MOUNT] Persistence path failed validation: /mnt/scratch_old/aicliagents/persistence
[2026-04-30 16:30:34] [ERR!] [StorageMountService] Mount script FAILED for agent claude-code:
[2026-04-30 16:30:34] [ERR!] [MOUNT] Persistence path failed validation: /mnt/scratch_old/aicliagents/persistence
[2026-04-30 16:30:34] [ERR!] [InstallerService] Failed to mount storage stack for claude-code
[2026-04-30 16:30:34] [ERR!] [AICliAgents] Background Install Job FAILED for claude-code: Could not mount agent storage

It seems to be trying to force me to use the USB / fixed paths?
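To picture what a check like guard_path might be doing, here is a minimal sketch of a prefix allow-list test. I have not read the plugin's source, so the prefixes below and the function name is_allowed are hypothetical; the real list is not shown in the logs.

```python
from pathlib import PurePosixPath

# Hypothetical allow-list, for illustration only. The plugin's real
# prefixes do not appear in the log output above.
ALLOWED_PREFIXES = ("/boot/config/plugins", "/mnt/user", "/mnt/cache")


def is_allowed(path, prefixes=ALLOWED_PREFIXES):
    """Return True if path equals, or sits under, one of the prefixes.

    The comparison is per path component, so "/mnt/username" does not
    match the prefix "/mnt/user". ".." segments are not resolved here.
    """
    candidate = PurePosixPath(path)
    for prefix in prefixes:
        root = PurePosixPath(prefix)
        if candidate == root or root in candidate.parents:
            return True
    return False


if __name__ == "__main__":
    # The path from the log fails against this made-up list.
    print(is_allowed("/mnt/scratch_old/aicliagents/persistence"))
```

If the plugin works roughly like this, a custom pool such as /mnt/scratch_old would be rejected unless it is on the list, which would match the behaviour in the log.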
-
[Support] selfhosters.net's Template Repository
Has anyone set up a Forgejo CI runner? From a quick scan, it looks like you need to run a new container that itself runs either Podman or Docker. Has anyone made anything that plugs straight into Unraid?
https://forgejo.org/docs/latest/admin/actions/runner-installation/
https://dbushell.com/2025/08/15/self-hosted-forgejo-actions-runner/
-
Dual Link to SAS Expander gives link resets: SAS3224 with Super Mei TOP24 Expander
Thank you @JorgeB, dual link is now working with the firmware updated. Dual link is in use, there are no more errors in syslog/dmesg, and a parity check is running at ~150 MB/s across 21 disks. I'm happy with that for SATA drives on an expander. Marking as solution.
-
Dual Link to SAS Expander gives link resets: SAS3224 with Super Mei TOP24 Expander
Let me write a miniguide for muppets like me:

Download https://www.broadcom.com/site-search?q=Installer_P16_for_Linux and extract it somewhere you can run things. Confirm the make, model and firmware with:

root@beyonder-nas:/mnt/scratch_old/backups/storage# ./sas3flash -list
Avago Technologies SAS3 Flash Utility
Version 17.00.00.00 (2018.04.02)
Copyright 2008-2018 Avago Technologies. All rights reserved.
Adapter Selected is a Avago SAS: SAS3224(A1)
Controller Number : 0
Controller : SAS3224(A1)
PCI Address : 00:02:00:00
SAS Address : 500062b-2-0299-1611
NVDATA Version (Default) : 10.00.00.03
NVDATA Version (Persistent) : 10.00.00.03
Firmware Product ID : 0x2228 (IT)
Firmware Version : 16.00.01.00
NVDATA Vendor : LSI
NVDATA Product ID : SAS9305-24i
BIOS Version : 08.37.00.00
UEFI BSD Version : 18.00.00.00
FCODE Version : N/A
Board Name : SAS9305-24i
Board Assembly : 03-25699-02004
Board Tracer Number : XW841903CD
Finished Processing Commands Successfully.
Exiting SAS3Flash.

I have a 9305-24i, so I need this: https://docs.broadcom.com/docs/9305_24i_Pkg_P16.12_IT_FW_BIOS_for_MSDOS_Windows.zip . Extract it and look for the firmware; it will be a .bin file. Now we can flash the firmware:

root@beyonder-nas:/mnt/scratch_old/backups/storage# ./sas3flash -o -c 00 -f SAS9305_24i_IT_P.bin
Avago Technologies SAS3 Flash Utility
Version 17.00.00.00 (2018.04.02)
Copyright 2008-2018 Avago Technologies. All rights reserved.
Advanced Mode Set
Adapter Selected is a Avago SAS: SAS3224(A1)
Executing Operation: Flash Firmware Image
Firmware Image has a Valid Checksum.
Firmware Version 16.00.12.00
Firmware Image compatible with Controller.
Valid NVDATA Image found.
NVDATA Major Version 10.00
Checking for a compatible NVData image...
NVDATA Device ID and Chip Revision match verified.
NVDATA Versions Compatible.
Valid Initialization Image verified.
Valid BootLoader Image verified.
Beginning Firmware Download...
Firmware Download Successful.
Verifying Download...
Firmware Flash Successful.
Resetting Adapter...
Adapter Successfully Reset.
NVDATA Version 10.00.00.03
Finished Processing Commands Successfully.
Exiting SAS3Flash.

Finally, you can confirm it is flashed:

root@beyonder-nas:/mnt/scratch_old/backups/storage# ./sas3flash -list
Avago Technologies SAS3 Flash Utility
Version 17.00.00.00 (2018.04.02)
Copyright 2008-2018 Avago Technologies. All rights reserved.
Adapter Selected is a Avago SAS: SAS3224(A1)
Controller Number : 0
Controller : SAS3224(A1)
PCI Address : 00:02:00:00
SAS Address : 500062b-2-0299-1611
NVDATA Version (Default) : 10.00.00.03
NVDATA Version (Persistent) : 10.00.00.03
Firmware Product ID : 0x2228 (IT)
Firmware Version : 16.00.12.00
NVDATA Vendor : LSI
NVDATA Product ID : SAS9305-24i
BIOS Version : 08.37.00.00
UEFI BSD Version : 18.00.00.00
FCODE Version : N/A
Board Name : SAS9305-24i
Board Assembly : 03-25699-02004
Board Tracer Number : XW841903CD
Finished Processing Commands Successfully.
Exiting SAS3Flash.

Now I'll recable and see if this solved my problem.
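If you would rather not eyeball the two listings, the before/after check can be scripted. This is a minimal sketch that pulls the "Firmware Version" field out of saved `sas3flash -list` output and compares the two; the function name firmware_version and the BEFORE/AFTER snippets (cut down from the output above) are my own, not part of any Broadcom tool.

```python
import re


def firmware_version(listing):
    """Extract 'Firmware Version : a.b.c.d' from sas3flash -list text.

    Returns the version as a tuple of ints so versions compare
    numerically. The colon is required, so the flash log line
    'Firmware Version 16.00.12.00' (no colon) is not matched.
    """
    match = re.search(r"Firmware Version\s*:\s*(\d+(?:\.\d+)+)", listing)
    if match is None:
        raise ValueError("no 'Firmware Version' field found")
    return tuple(int(part) for part in match.group(1).split("."))


# Shortened excerpts of the listings shown above.
BEFORE = "Firmware Product ID : 0x2228 (IT) Firmware Version : 16.00.01.00 NVDATA Vendor : LSI"
AFTER = "Firmware Product ID : 0x2228 (IT) Firmware Version : 16.00.12.00 NVDATA Vendor : LSI"

if __name__ == "__main__":
    old, new = firmware_version(BEFORE), firmware_version(AFTER)
    print("firmware updated" if new > old else "firmware unchanged")
```

Comparing tuples of integers rather than the raw strings avoids surprises if a field ever has a different number of digits.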
-
Dual Link to SAS Expander gives link resets: SAS3224 with Super Mei TOP24 Expander
Do you know where I could find this? I struggled to find the latest firmware, and the Broadcom site no longer seems to offer downloads for anything but the latest.

Edit: never mind, I found what I need:
https://www.broadcom.com/site-search?q=Installer_P16_for_Linux
https://docs.broadcom.com/docs/9305_24i_Pkg_P16.12_IT_FW_BIOS_for_MSDOS_Windows.zip

Run sas3flash -list to confirm the version. For some reason the package is labelled for MSDOS; you can change the model and the 24i part of the URL to get the right package.
-
Dual Link to SAS Expander gives link resets: SAS3224 with Super Mei TOP24 Expander
I have currently cabled my back-plane to single link as i found i received link resets and read/write errors (UDMA CRC errors) if i use dual link. I unfortunately do not have another SAS controller to eliminate it as the cause of the problems, but I have recabled. Currently I am saturating the single channel link when I do parity actions and it tops out at 2.2G. SAS Expander: Super Mei TOP24 Expander 05G451625012 REV CO BLACK 12Gb (there is a sticker saying FW 0a00 2521 V1.00) SAS Controller: Broadcom / LSI SAS3224 PCI-Express Fusion-MPT SAS-3 (rev 01) (during boot it says LSISAS3224: FWVersion(16.00.01.00), ChipRevision(0x01), BiosVersion(18.00.00.00) ) [1000:00c4] 02:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3224 PCI-Express Fusion-MPT SAS-3 (rev 01) [10:0:20:0] disk ATA WDC WD200EDGZ-11 0A85 /dev/sdaa 20.0TB [10:0:21:0] disk ATA WDC WD200EDGZ-11 0A85 /dev/sdab 20.0TB [10:0:22:0] disk ATA WDC WD200EDGZ-11 0A85 /dev/sdac 20.0TB [10:0:23:0] disk ATA WDC WD200EDGZ-11 0A85 /dev/sdad 20.0TB [10:0:0:0] disk ATA WDC WD140EMFZ-11 0A81 /dev/sdc 14.0TB [10:0:1:0] disk ATA WDC WD140EMFZ-11 0A81 /dev/sdd 14.0TB [10:0:2:0] disk ATA WDC WD140EMFZ-11 0A81 /dev/sde 14.0TB [10:0:3:0] disk ATA ST12000VN0007-2G SC60 /dev/sdf 12.0TB [10:0:4:0] disk ATA WDC WD180EDGZ-11 0A85 /dev/sdg 18.0TB [10:0:5:0] disk ATA WDC WD180EDGZ-11 0A85 /dev/sdi 18.0TB [10:0:6:0] disk ATA WDC WD180EDGZ-11 0A85 /dev/sdj 18.0TB [10:0:7:0] disk ATA WDC WD140EMFZ-11 0A81 /dev/sdn 14.0TB [10:0:8:0] disk ATA WDC WD180EDGZ-11 0A85 /dev/sdo 18.0TB [10:0:9:0] disk ATA WDC WD180EDGZ-11 0A85 /dev/sdp 18.0TB [10:0:10:0] disk ATA WDC WD180EDGZ-11 0A85 /dev/sdq 18.0TB [10:0:11:0] disk ATA WDC WD180EDGZ-11 0A85 /dev/sdr 18.0TB [10:0:12:0] disk ATA WDC WD200EDGZ-11 0A85 /dev/sds 20.0TB [10:0:13:0] disk ATA WDC WD200EDGZ-11 0A85 /dev/sdt 20.0TB [10:0:14:0] disk ATA WDC WD200EDGZ-11 0A85 /dev/sdu 20.0TB [10:0:15:0] disk ATA WDC WD200EDGZ-11 0A85 /dev/sdv 20.0TB [10:0:16:0] disk ATA WDC WD200EDGZ-11 0A85 
/dev/sdw 20.0TB [10:0:17:0] disk ATA WDC WD200EDGZ-11 0A85 /dev/sdx 20.0TB [10:0:18:0] disk ATA WDC WD200EDGZ-11 0A85 /dev/sdy 20.0TB [10:0:19:0] disk ATA WDC WD200EDGZ-11 0A85 /dev/sdz 20.0TB All of the diagnostics are from the single link setup, but perhaps there are some hints as to what is the problem. This link looks to be established fine, and you can see the firmware and versions of the controller. [ 36.314490] mpt3sas_cm0: iomem(0x0000000082500000), mapped(0x00000000430be8e9), size(65536) [ 36.316459] mpt3sas_cm0: ioport(0x0000000000004000), size(256) [ 36.341341] mpt3sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k [ 36.342958] mpt3sas_cm0: sending message unit reset !! [ 36.345780] mpt3sas_cm0: message unit reset: SUCCESS [ 36.348106] mpt3sas_cm0: scatter gather: sge_in_main_msg(9), sge_per_chain(15), sge_per_io(128), chains_per_io(8) [ 36.355313] mpt3sas_cm0: request pool(0x00000000d091fde2) - dma(0x139200000): depth(5852), frame_size(256), pool_size(1463 kB) [ 36.371186] mpt3sas_cm0: sense pool(0x00000000ebd46e7d) - dma(0x13a700000): depth(5791), element_size(96), pool_size (542 kB) [ 36.373700] mpt3sas_cm0: reply pool(0x000000004f53c277) - dma(0x13a800000): depth(5916), frame_size(128), pool_size(739 kB) [ 36.376178] mpt3sas_cm0: config page(0x00000000035a8032) - dma(0x13234d000): size(512) [ 36.378734] mpt3sas_cm0: Allocated physical memory: size(17295 kB) [ 36.381342] mpt3sas_cm0: Current Controller Queue Depth(5788),Max Controller Queue Depth(5824) [ 36.384083] mpt3sas_cm0: Scatter Gather Elements per IO(128) [ 36.525341] mpt3sas_cm0: _base_display_fwpkg_version: complete [ 36.525557] mpt3sas_cm0: overriding NVDATA EEDPTagMode setting [ 36.526162] mpt3sas_cm0: LSISAS3224: FWVersion(16.00.01.00), ChipRevision(0x01), BiosVersion(18.00.00.00) [ 36.526167] mpt3sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ) [ 36.530365] mpt3sas_cm0: sending port enable !! 
[ 36.534865] mpt3sas_cm0: hba_port entry: 0000000081440fd9, port: 255 is added to hba_port list [ 36.546212] mpt3sas_cm0: host_add: handle(0x0001), sas_addr(0x500062b202991611), phys(24) [ 36.553485] mpt3sas_cm0: expander_add: handle(0x0019), parent(0x0011), sas_addr(0xcc1be0310015463f), phys(37) [ 36.559900] expander-10:0: add: handle(0x0019), sas_addr(0xcc1be0310015463f) [ 36.559922] mpt3sas_cm0: port enable: SUCCESS [ 36.577354] mpt3sas_cm0: handle(0x1a) sas_address(0xcc1be03100154600) port_type(0x1) [ 36.590904] scsi 10:0:0:0: SATA: handle(0x001a), sas_addr(0xcc1be03100154600), phy(0), device_name(0x5000cca264d70f29) [ 36.644352] end_device-10:0:0: add: handle(0x001a), sas_addr(0xcc1be03100154600) [ 36.644814] mpt3sas_cm0: handle(0x1b) sas_address(0xcc1be03100154601) port_type(0x1) [ 36.651453] scsi 10:0:1:0: SATA: handle(0x001b), sas_addr(0xcc1be03100154601), phy(1), device_name(0x5000cca264df1ce0) [ 36.655977] end_device-10:0:1: add: handle(0x001b), sas_addr(0xcc1be03100154601) [ 36.656265] mpt3sas_cm0: handle(0x1c) sas_address(0xcc1be03100154602) port_type(0x1) [ 36.657287] scsi 10:0:2:0: SATA: handle(0x001c), sas_addr(0xcc1be03100154602), phy(2), device_name(0x5000cca28dc54e4b) [ 36.669233] end_device-10:0:2: add: handle(0x001c), sas_addr(0xcc1be03100154602) [ 36.678566] mpt3sas_cm0: handle(0x24) sas_address(0xcc1be03100154603) port_type(0x1) [ 36.702145] scsi 10:0:3:0: SATA: handle(0x0024), sas_addr(0xcc1be03100154603), phy(3), device_name(0x5000c500a392550d) [ 36.801469] end_device-10:0:3: add: handle(0x0024), sas_addr(0xcc1be03100154603) [ 36.811823] mpt3sas_cm0: handle(0x1d) sas_address(0xcc1be03100154604) port_type(0x1) [ 36.845256] scsi 10:0:4:0: SATA: handle(0x001d), sas_addr(0xcc1be03100154604), phy(4), device_name(0x5000cca2afc6dedc) [ 36.892815] end_device-10:0:4: add: handle(0x001d), sas_addr(0xcc1be03100154604) [ 36.902157] mpt3sas_cm0: handle(0x1e) sas_address(0xcc1be03100154605) port_type(0x1) [ 36.919262] scsi 10:0:5:0: SATA: handle(0x001e), 
sas_addr(0xcc1be03100154605), phy(5), device_name(0x5000cca284d5623f) [ 36.978479] end_device-10:0:5: add: handle(0x001e), sas_addr(0xcc1be03100154605) [ 37.003230] mpt3sas_cm0: handle(0x1f) sas_address(0xcc1be03100154606) port_type(0x1) [ 37.098613] scsi 10:0:6:0: SATA: handle(0x001f), sas_addr(0xcc1be03100154606), phy(6), device_name(0x5000cca2afc3c57c) [ 37.287258] end_device-10:0:6: add: handle(0x001f), sas_addr(0xcc1be03100154606) [ 37.295286] mpt3sas_cm0: handle(0x20) sas_address(0xcc1be03100154607) port_type(0x1) [ 37.305559] scsi 10:0:7:0: SATA: handle(0x0020), sas_addr(0xcc1be03100154607), phy(7), device_name(0x5000cca264dfc5ee) [ 37.321216] end_device-10:0:7: add: handle(0x0020), sas_addr(0xcc1be03100154607) [ 37.333591] mpt3sas_cm0: handle(0x22) sas_address(0xcc1be03100154608) port_type(0x1) [ 37.350951] scsi 10:0:8:0: SATA: handle(0x0022), sas_addr(0xcc1be03100154608), phy(8), device_name(0x5000cca284ef2699) [ 37.515223] end_device-10:0:8: add: handle(0x0022), sas_addr(0xcc1be03100154608) [ 37.520455] mpt3sas_cm0: handle(0x23) sas_address(0xcc1be03100154609) port_type(0x1) [ 37.544456] scsi 10:0:9:0: SATA: handle(0x0023), sas_addr(0xcc1be03100154609), phy(9), device_name(0x5000cca2afc510fb) [ 37.582514] end_device-10:0:9: add: handle(0x0023), sas_addr(0xcc1be03100154609) [ 37.591047] mpt3sas_cm0: handle(0x25) sas_address(0xcc1be0310015460a) port_type(0x1) [ 37.607455] scsi 10:0:10:0: SATA: handle(0x0025), sas_addr(0xcc1be0310015460a), phy(10), device_name(0x5000cca2afc41fea) [ 37.641779] end_device-10:0:10: add: handle(0x0025), sas_addr(0xcc1be0310015460a) [ 37.650665] mpt3sas_cm0: handle(0x26) sas_address(0xcc1be0310015460b) port_type(0x1) [ 37.667995] scsi 10:0:11:0: SATA: handle(0x0026), sas_addr(0xcc1be0310015460b), phy(11), device_name(0x5000cca284d4e2cf) [ 37.716350] end_device-10:0:11: add: handle(0x0026), sas_addr(0xcc1be0310015460b) [ 37.725195] mpt3sas_cm0: handle(0x27) sas_address(0xcc1be0310015460c) port_type(0x1) [ 37.747290] scsi 
10:0:12:0: SATA: handle(0x0027), sas_addr(0xcc1be0310015460c), phy(12), device_name(0x5000cca2bec4ff5b) [ 37.849489] end_device-10:0:12: add: handle(0x0027), sas_addr(0xcc1be0310015460c) [ 37.849629] mpt3sas_cm0: handle(0x28) sas_address(0xcc1be0310015460d) port_type(0x1) [ 37.857081] scsi 10:0:13:0: SATA: handle(0x0028), sas_addr(0xcc1be0310015460d), phy(13), device_name(0x5000cca407e2e65a) [ 37.935803] end_device-10:0:13: add: handle(0x0028), sas_addr(0xcc1be0310015460d) [ 37.936005] mpt3sas_cm0: handle(0x29) sas_address(0xcc1be0310015460e) port_type(0x1) [ 37.943492] scsi 10:0:14:0: SATA: handle(0x0029), sas_addr(0xcc1be0310015460e), phy(14), device_name(0x5000cca2c0c512a2) [ 38.022516] end_device-10:0:14: add: handle(0x0029), sas_addr(0xcc1be0310015460e) [ 38.022733] mpt3sas_cm0: handle(0x2a) sas_address(0xcc1be0310015460f) port_type(0x1) [ 38.031128] scsi 10:0:15:0: SATA: handle(0x002a), sas_addr(0xcc1be0310015460f), phy(15), device_name(0x5000cca2bfc040d3) [ 38.115380] end_device-10:0:15: add: handle(0x002a), sas_addr(0xcc1be0310015460f) [ 38.116866] mpt3sas_cm0: handle(0x2b) sas_address(0xcc1be03100154610) port_type(0x1) [ 38.126201] scsi 10:0:16:0: SATA: handle(0x002b), sas_addr(0xcc1be03100154610), phy(16), device_name(0x5000cca2b3c380e6) [ 38.205629] end_device-10:0:16: add: handle(0x002b), sas_addr(0xcc1be03100154610) [ 38.206807] mpt3sas_cm0: handle(0x2c) sas_address(0xcc1be03100154611) port_type(0x1) [ 38.215695] scsi 10:0:17:0: SATA: handle(0x002c), sas_addr(0xcc1be03100154611), phy(17), device_name(0x5000cca2b3c34289) [ 38.293204] end_device-10:0:17: add: handle(0x002c), sas_addr(0xcc1be03100154611) [ 38.294389] mpt3sas_cm0: handle(0x2d) sas_address(0xcc1be03100154612) port_type(0x1) [ 38.304437] scsi 10:0:18:0: SATA: handle(0x002d), sas_addr(0xcc1be03100154612), phy(18), device_name(0x5000cca2becd9d1c) [ 38.389970] end_device-10:0:18: add: handle(0x002d), sas_addr(0xcc1be03100154612) [ 38.391154] mpt3sas_cm0: handle(0x2e) 
sas_address(0xcc1be03100154613) port_type(0x1) [ 38.401483] scsi 10:0:19:0: SATA: handle(0x002e), sas_addr(0xcc1be03100154613), phy(19), device_name(0x5000cca2bec8c3a3) [ 38.491219] end_device-10:0:19: add: handle(0x002e), sas_addr(0xcc1be03100154613) [ 38.492407] mpt3sas_cm0: handle(0x2f) sas_address(0xcc1be03100154614) port_type(0x1) [ 38.501773] scsi 10:0:20:0: SATA: handle(0x002f), sas_addr(0xcc1be03100154614), phy(20), device_name(0x5000cca2b3c4afc3) [ 38.583468] end_device-10:0:20: add: handle(0x002f), sas_addr(0xcc1be03100154614) [ 38.585420] mpt3sas_cm0: handle(0x30) sas_address(0xcc1be03100154615) port_type(0x1) [ 38.596504] scsi 10:0:21:0: SATA: handle(0x0030), sas_addr(0xcc1be03100154615), phy(21), device_name(0x5000cca2b3c35cc0) [ 38.679991] end_device-10:0:21: add: handle(0x0030), sas_addr(0xcc1be03100154615) [ 38.682711] mpt3sas_cm0: handle(0x31) sas_address(0xcc1be03100154616) port_type(0x1) [ 38.692968] scsi 10:0:22:0: SATA: handle(0x0031), sas_addr(0xcc1be03100154616), phy(22), device_name(0x5000cca2b3c35804) [ 38.773938] end_device-10:0:22: add: handle(0x0031), sas_addr(0xcc1be03100154616) [ 38.779631] mpt3sas_cm0: handle(0x32) sas_address(0xcc1be03100154617) port_type(0x1) [ 38.789334] scsi 10:0:23:0: SATA: handle(0x0032), sas_addr(0xcc1be03100154617), phy(23), device_name(0x5000cca2bec2f265) [ 38.880027] end_device-10:0:23: add: handle(0x0032), sas_addr(0xcc1be03100154617) [ 38.884685] mpt3sas_cm0: handle(0x21) sas_address(0xcc1be0310015463d) port_type(0x1) [ 38.901648] scsi 10:0:24:0: SES: handle(0x0021), sas_addr(0xcc1be0310015463d), phy(36), device_name(0xcc1be0310015463d) [ 38.919628] end_device-10:0:24: add: handle(0x0021), sas_addr(0xcc1be0310015463d) I was unable to find either the firmware for the expander nor the firmware for the controller. I am attaching diagnostics from the SAS controller in single link to the expander and running a parity check, as i didn't want to risk any more write errors. 
When it was in dual link mode i would get a flood of the following messages and an increase in UDMA CRC errors: mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) beyonder-nas-diagnostics-20250314-1045.zip
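To put a number on the flood, the messages can be counted and the log_info word split into its fields. This is a minimal sketch: the bit layout (bus type, originator, code, sub-code) is my reading of how the mpt3sas driver prints these lines, and it does reproduce the originator(PL), code(0x08), sub_code(0x0000) shown above for 0x31080000. Only the 1 = PL mapping is confirmed by this log; the other originator names are an assumption, and the sketch does not say what code 0x08 means.

```python
import re
from collections import Counter

# Only 1 -> "PL" is confirmed by the log above; 0 and 2 are assumed.
ORIGINATORS = {0: "IOP", 1: "PL", 2: "IR"}


def decode_log_info(value):
    """Split a 32-bit mpt3sas log_info word into its printed fields."""
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": ORIGINATORS.get((value >> 24) & 0xF, "unknown"),
        "code": (value >> 16) & 0xFF,
        "sub_code": value & 0xFFFF,
    }


def count_log_info(text):
    """Count occurrences of each log_info(0x...) value in dmesg text."""
    values = re.findall(r"log_info\((0x[0-9a-fA-F]+)\)", text)
    return Counter(int(v, 16) for v in values)


if __name__ == "__main__":
    sample = (
        "mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000) "
        "mpt3sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)"
    )
    for value, hits in count_log_info(sample).items():
        print(hex(value), hits, decode_log_info(value))
```

Running the counter over dmesg output captured in single link and again in dual link gives a like-for-like comparison instead of "a flood".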
-
[Plugin] Parity Check Tuning
This is a great plugin. I had a dodgy cable and needed to pause a data rebuild. Being able to pause, reboot and continue saved me hours I didn't have. This should definitely get incorporated into mainline Unraid.
-
Kernel issues, Bugs can't run parity
Switched out the 13900K for a 14900K: same power, same motherboard and all cores enabled. It worked fine. The 13900K was trash and had a broken core; I'm returning it to Amazon, as Intel are horrible to deal with.
-
[Support] ich777 - AMD Vendor Reset, CoralTPU, hpsahba,...
Okay, for all the muppets out there: be aware that firmware/BIOS updates and resets really do reset the settings, and not just the ones you are rabbit-holing on fixing. tl;dr: the BIOS update reset the iGPU setting to auto, so the ASPEED adapter was being passed through instead of the actual Intel iGPU.
-
[Support] ich777 - AMD Vendor Reset, CoralTPU, hpsahba,...
This used to work but suddenly stopped (along with plex hardware transcoding), any ideas? root@beyonder-nas:~# intel_gpu_top No device filter specified and no discrete/integrated i915 devices found root@beyonder-nas:~# uname -a Linux beyonder-nas 6.1.126-Unraid #1 SMP PREEMPT_DYNAMIC Sun Jan 19 15:51:34 PST 2025 x86_64 13th Gen Intel(R) Core(TM) i9-13900K GenuineIntel GNU/Linux root@beyonder-nas:~# cat /proc/cmdline BOOT_IMAGE=/bzimage i915.enable_fbc=1 i915.enable_guc=2 nvme_core.default_ps_max_latency_us=0 iommu=pt initrd=/bzroot root@beyonder-nas:~# cat /etc/modprobe.d/i915.conf options i915 enable_fbc=1 enable_guc=2 root@beyonder-nas:~# lspci 00:00.0 Host bridge: Intel Corporation Device a700 (rev 01) 00:01.0 PCI bridge: Intel Corporation Raptor Lake PCI Express 5.0 Graphics Port (PEG010) (rev 01) 00:01.1 PCI bridge: Intel Corporation Device a72d (rev 01) 00:06.0 PCI bridge: Intel Corporation Raptor Lake PCIe 4.0 Graphics Port (rev 01) 00:0a.0 Signal processing controller: Intel Corporation Raptor Lake Crashlog and Telemetry (rev 01) 00:14.0 USB controller: Intel Corporation Alder Lake-S PCH USB 3.2 Gen 2x2 XHCI Controller (rev 11) 00:14.2 RAM memory: Intel Corporation Alder Lake-S PCH Shared SRAM (rev 11) 00:15.0 Serial bus controller: Intel Corporation Alder Lake-S PCH Serial IO I2C Controller #0 (rev 11) 00:15.1 Serial bus controller: Intel Corporation Alder Lake-S PCH Serial IO I2C Controller #1 (rev 11) 00:16.0 Communication controller: Intel Corporation Alder Lake-S PCH HECI Controller #1 (rev 11) 00:16.3 Serial controller: Intel Corporation Alder Lake-S Keyboard and Text (KT) Redirection (rev 11) 00:17.0 SATA controller: Intel Corporation Alder Lake-S PCH SATA Controller [AHCI Mode] (rev 11) 00:1a.0 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #25 (rev 11) 00:1b.0 PCI bridge: Intel Corporation Device 7ac0 (rev 11) 00:1b.4 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #21 (rev 11) 00:1c.0 PCI bridge: Intel 
Corporation Alder Lake-S PCH PCI Express Root Port #1 (rev 11) 00:1c.1 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #2 (rev 11) 00:1c.3 PCI bridge: Intel Corporation Device 7abb (rev 11) 00:1c.4 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #5 (rev 11) 00:1d.0 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port #9 (rev 11) 00:1f.0 ISA bridge: Intel Corporation Device 7a88 (rev 11) 00:1f.3 Audio device: Intel Corporation Alder Lake-S HD Audio Controller (rev 11) 00:1f.4 SMBus: Intel Corporation Alder Lake-S PCH SMBus Controller (rev 11) 00:1f.5 Serial bus controller: Intel Corporation Alder Lake-S PCH SPI Controller (rev 11) 00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (17) I219-LM (rev 11) 01:00.0 RAID bus controller: Broadcom / LSI Fusion-MPT 24GSAS/PCIe SAS40xx/41xx (rev 01) 02:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3224 PCI-Express Fusion-MPT SAS-3 (rev 01) 03:00.0 Non-Volatile memory controller: Sandisk Corp WD Black SN850X NVMe SSD (rev 01) 04:00.0 Non-Volatile memory controller: Sandisk Corp WD Black SN850X NVMe SSD (rev 01) 06:00.0 Non-Volatile memory controller: Sandisk Corp WD Black SN850X NVMe SSD (rev 01) 07:00.0 PCI bridge: Integrated Technology Express, Inc. IT8893E PCIe to PCI Bridge (rev 41) 09:00.0 Ethernet controller: Intel Corporation Ethernet Controller I225-LM (rev 03) 0a:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 06) 0b:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 52) 0c:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO 0d:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
-
Kernel issues, Bugs can't run parity
28% of a parity check done with 2 P-cores and 4 E-cores, so it's looking like a faulty core. I will see if this parity check completes, then try increasing the number of cores and repeat. RMA with Intel, I guess.
-
Kernel issues, Bugs can't run parity
I disabled all but 2 P-cores and 4 E-cores and am currently at 6.9% parity, which is further than it's got with any other test. Let's see how it goes, but do you think it's a broken Intel CPU core, or still power?
-
Kernel issues, Bugs can't run parity
So I'm at a loss:
CPU is stable under duress (x265, stress-ng)
Memory passes a memory test and is ECC, so it should be logging if there were bit flips
BIOS/motherboard on multiple versions
Disk read/write seems to be fine; I'm able to transcode and direct play
Docker is fine; only the main Unraid thread crashes
Parity is under less load than the CPU stress tests
No power events in IPMI or kernel logs
I was sure it would be the processor, but it seems to affect only parity, where it reproduces 100% of the time at between 1% and 5% completed. Could it be the LSI SAS controller, the USB drive or something else, and can they be isolated for testing without physical access to the server? Is there a parity-like test I can run to simulate a failure?
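One rough way to approximate the I/O side of a parity check remotely is to read every array disk at the same time. This is only a sketch under my own assumptions: the helper names read_through and read_all are made up, it opens the devices read-only, and it does not go through the Unraid md driver where the kernel BUG is reported, so a clean run would only point away from the controller, cables and disks rather than clear the whole stack.

```python
import sys
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1024 * 1024  # read in 1 MiB pieces


def read_through(path, limit=None):
    """Read path sequentially (read-only) and return the bytes read.

    limit caps the number of bytes read; None reads to end of file.
    """
    total = 0
    with open(path, "rb", buffering=0) as handle:
        while limit is None or total < limit:
            want = CHUNK if limit is None else min(CHUNK, limit - total)
            data = handle.read(want)
            if not data:
                break
            total += len(data)
    return total


def read_all(paths, limit=None):
    """Read all paths concurrently, one thread per path."""
    with ThreadPoolExecutor(max_workers=max(1, len(paths))) as pool:
        sizes = pool.map(lambda p: read_through(p, limit), paths)
        return dict(zip(paths, sizes))


if __name__ == "__main__":
    # Example: python3 readall.py /dev/sdc /dev/sdd /dev/sde
    for path, size in read_all(sys.argv[1:]).items():
        print(path, size)
```

Watching dmesg while this runs shows whether the controller or expander complains under simultaneous reads without involving the parity code at all.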
-
Kernel issues, Bugs can't run parity
No, all the original cables and into a sas backplane. The sas backplane isn't super good though. The read/writes to the drives seem to be fine. Unfortunately at 3.8%, its happened again: [ 4079.681264] ------------[ cut here ]------------ [ 4079.681274] kernel BUG at drivers/md/unraid.c:1617! [ 4079.681812] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 4079.682290] CPU: 8 PID: 15720 Comm: unraidd0 Tainted: P O 6.1.99-Unraid #1 [ 4079.682796] Hardware name: Supermicro Super Server/X13SAE-F, BIOS 4.1 10/01/2024 [ 4079.683290] RIP: 0010:unraidd+0x1051/0x1140 [md_mod] [ 4079.683791] Code: 00 83 3d 83 50 00 00 03 7e 16 41 8b 56 98 89 e9 48 c7 c7 21 e3 7c a0 48 8b 73 20 e8 b2 b7 09 e1 41 f6 86 69 ff ff ff 02 75 02 <0f> 0b 48 8b 43 20 49 03 47 10 41 c7 46 b0 00 10 00 00 49 8b 56 10 [ 4079.684820] RSP: 0018:ffffc900205efdf0 EFLAGS: 00010246 [ 4079.685333] RAX: 0000000000000000 RBX: ffff88814a680da8 RCX: 0000000000000000 [ 4079.685854] RDX: 0000000000000000 RSI: ffffffff829ee720 RDI: ffff888108624038 [ 4079.686377] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 [ 4079.686893] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888148f62110 [ 4079.687398] R13: ffff88814a680fa0 R14: ffff88814a681018 R15: ffff8881049152d8 [ 4079.687899] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 4079.688404] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4079.688903] CR2: 000015173aee8000 CR3: 000000000420a005 CR4: 0000000000770ee0 [ 4079.689404] PKRU: 55555554 [ 4079.689896] Call Trace: [ 4079.690382] <TASK> [ 4079.690860] ? __die_body+0x1a/0x5c [ 4079.691340] ? die+0x30/0x49 [ 4079.691806] ? do_trap+0x7b/0xfe [ 4079.692274] ? unraidd+0x1051/0x1140 [md_mod] [ 4079.692729] ? unraidd+0x1051/0x1140 [md_mod] [ 4079.693171] ? do_error_trap+0x6e/0x98 [ 4079.693607] ? unraidd+0x1051/0x1140 [md_mod] [ 4079.694044] ? exc_invalid_op+0x4c/0x60 [ 4079.694482] ? unraidd+0x1051/0x1140 [md_mod] [ 4079.694904] ? 
asm_exc_invalid_op+0x16/0x20 [ 4079.695323] ? unraidd+0x1051/0x1140 [md_mod] [ 4079.695731] md_thread+0xf4/0x122 [md_mod] [ 4079.696132] ? _raw_spin_rq_lock_irqsave+0x20/0x20 [ 4079.696531] ? signal_pending+0x1d/0x1d [md_mod] [ 4079.696922] kthread+0xe4/0xef [ 4079.697317] ? kthread_complete_and_exit+0x1b/0x1b [ 4079.697719] ret_from_fork+0x1f/0x30 [ 4079.698109] </TASK> [ 4079.698487] Modules linked in: xt_nat xt_tcpudp veth xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter ipmi_devintf md_mod xfs tcp_diag inet_diag i915 drm_buddy drm_display_helper intel_gtt iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs macvtap macvlan tap af_packet 8021q garp mrp bridge stp llc bonding tls igc e1000e intel_rapl_msr intel_rapl_common iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp zfs(PO) kvm_intel kvm ast crct10dif_pclmul drm_vram_helper crc32_pclmul crc32c_intel i2c_algo_bit ghash_clmulni_intel drm_ttm_helper zunicode(PO) sha512_ssse3 sha256_ssse3 ttm zzstd(O) sha1_ssse3 drm_kms_helper zlua(O) aesni_intel zavl(PO) drm crypto_simd icp(PO) agpgart mei_pxp mei_hdcp syscopyarea i2c_i801 cryptd [ 4079.698517] zcommon(PO) rapl znvpair(PO) intel_cstate spl(O) ipmi_ssif wmi_bmof mpt3sas intel_uncore sysfillrect i2c_smbus mei_me nvme input_leds mpi3mr sysimgblt video ahci raid_class acpi_ipmi i2c_core joydev led_class fb_sys_fops mei nvme_core libahci scsi_transport_sas thermal fan wmi ipmi_si backlight intel_pmc_core acpi_tad acpi_pad button unix [last unloaded: igc] [ 4079.705481] ---[ end trace 0000000000000000 ]--- [ 4079.745331] RIP: 0010:unraidd+0x1051/0x1140 [md_mod] [ 4079.745970] Code: 00 83 3d 83 50 00 00 03 7e 16 41 8b 56 98 89 e9 48 c7 c7 21 e3 7c a0 48 8b 73 20 e8 b2 b7 09 e1 41 f6 86 69 ff ff ff 
02 75 02 <0f> 0b 48 8b 43 20 49 03 47 10 41 c7 46 b0 00 10 00 00 49 8b 56 10 [ 4079.747220] RSP: 0018:ffffc900205efdf0 EFLAGS: 00010246 [ 4079.747846] RAX: 0000000000000000 RBX: ffff88814a680da8 RCX: 0000000000000000 [ 4079.748451] RDX: 0000000000000000 RSI: ffffffff829ee720 RDI: ffff888108624038 [ 4079.749071] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 [ 4079.749694] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888148f62110 [ 4079.750313] R13: ffff88814a680fa0 R14: ffff88814a681018 R15: ffff8881049152d8 [ 4079.750899] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 4079.751505] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4079.752120] CR2: 000015173aee8000 CR3: 000000000420a005 CR4: 0000000000770ee0 [ 4079.752723] PKRU: 55555554 [ 4079.753339] ------------[ cut here ]------------ [ 4079.753955] WARNING: CPU: 8 PID: 15720 at kernel/exit.c:816 do_exit+0x87/0x923 [ 4079.754582] Modules linked in: xt_nat xt_tcpudp veth xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter ipmi_devintf md_mod xfs tcp_diag inet_diag i915 drm_buddy drm_display_helper intel_gtt iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs macvtap macvlan tap af_packet 8021q garp mrp bridge stp llc bonding tls igc e1000e intel_rapl_msr intel_rapl_common iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp zfs(PO) kvm_intel kvm ast crct10dif_pclmul drm_vram_helper crc32_pclmul crc32c_intel i2c_algo_bit ghash_clmulni_intel drm_ttm_helper zunicode(PO) sha512_ssse3 sha256_ssse3 ttm zzstd(O) sha1_ssse3 drm_kms_helper zlua(O) aesni_intel zavl(PO) drm crypto_simd icp(PO) agpgart mei_pxp mei_hdcp syscopyarea i2c_i801 cryptd [ 4079.754606] zcommon(PO) rapl znvpair(PO) 
intel_cstate spl(O) ipmi_ssif wmi_bmof mpt3sas intel_uncore sysfillrect i2c_smbus mei_me nvme input_leds mpi3mr sysimgblt video ahci raid_class acpi_ipmi i2c_core joydev led_class fb_sys_fops mei nvme_core libahci scsi_transport_sas thermal fan wmi ipmi_si backlight intel_pmc_core acpi_tad acpi_pad button unix [last unloaded: igc] [ 4079.763171] CPU: 8 PID: 15720 Comm: unraidd0 Tainted: P D O 6.1.99-Unraid #1 [ 4079.764063] Hardware name: Supermicro Super Server/X13SAE-F, BIOS 4.1 10/01/2024 [ 4079.764958] RIP: 0010:do_exit+0x87/0x923 [ 4079.765855] Code: 24 74 04 75 13 b8 01 00 00 00 41 89 6c 24 60 48 c1 e0 22 49 89 44 24 70 4c 89 ef e8 d1 2b 81 00 48 83 bb b0 07 00 00 00 74 02 <0f> 0b 48 8b bb d8 06 00 00 e8 d3 2a 81 00 48 8b 83 d0 06 00 00 83 [ 4079.767710] RSP: 0018:ffffc900205efee0 EFLAGS: 00010286 [ 4079.768635] RAX: 0000000000000000 RBX: ffff888109c4f000 RCX: 0000000000000000 [ 4079.769571] RDX: 0000000000000001 RSI: 0000000000002710 RDI: 00000000ffffffff [ 4079.770509] RBP: 000000000000000b R08: 0000000000000000 R09: ffffc900037e5020 [ 4079.771442] R10: 0000000000aaaaaa R11: 0000000000000001 R12: ffff888104636c00 [ 4079.772362] R13: ffff88814822dac0 R14: 0000000000000002 R15: ffffffff820b2ea5 [ 4079.773257] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 4079.774142] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4079.775011] CR2: 000015173aee8000 CR3: 000000000420a005 CR4: 0000000000770ee0 [ 4079.775867] PKRU: 55555554 [ 4079.776700] Call Trace: [ 4079.777510] <TASK> [ 4079.778290] ? __warn+0xab/0x122 [ 4079.779051] ? report_bug+0x109/0x17e [ 4079.779793] ? do_exit+0x87/0x923 [ 4079.780513] ? handle_bug+0x41/0x6f [ 4079.781215] ? exc_invalid_op+0x13/0x60 [ 4079.781914] ? asm_exc_invalid_op+0x16/0x20 [ 4079.782601] ? 
do_exit+0x87/0x923 [ 4079.783272] make_task_dead+0x11c/0x11c [ 4079.783934] rewind_stack_and_make_dead+0x17/0x17 [ 4079.784598] RIP: 0000:0x0 [ 4079.785257] Code: Unable to access opcode bytes at 0xffffffffffffffd6. [ 4079.785925] RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000 [ 4079.786611] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 4079.787290] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 4079.787953] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 4079.788609] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 4079.789252] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 4079.789882] </TASK> 4.1 bios didn't help. No other errors but the parity is stuck/crashed and i can only reboot. The system hasn't hung/dockers are working fine.
-
Kernel issues, Bugs can't run parity
I think the power draw is the same or higher with an x265 encode (~450w), but I guess a parity check would involve more lines (LSI card, all the drives, CPU, etc.) and could be susceptible to that. It's a Corsair 1000w Platinum, so I'd be pretty shaken in my view of Corsair and their PSUs if it's this; they've been really solid for me, even in server builds. I do have a spare replacement PSU but don't have access to the server until March to swap it. I kicked off a new parity check with the updated BIOS, 4.1, and it's got to 2.5% (not out of the woods by a long shot, but further than it got the last two times). UPS load is currently ~360w and the CPU isn't running as hot. Let's hope it's a motherboard/BIOS issue with 3.3b. I really wish Supermicro would provide release notes.
-
Kernel issues, Bugs can't run parity
4K x265 encode on placebo was solid, and the memory test passed. I've noticed there's a new BIOS (3.3b > 4.1), so I'll try that next. There are no actual release notes from Supermicro, so other than the Intel microcode, I'm not sure what they've changed. So far in the stress tests, only the parity check is a problem.
-
Kernel issues, Bugs can't run parity
Any way to narrow it down? The CPU was the obvious suspect given the press, but I'm unable to repro without a parity check. I will try an x265 encode today. Could this be caused by a faulty backplane or SAS controller?
-
Kernel issues, Bugs can't run parity
Ran 4x transcodes, and saturating the cpu with stress-ng with 100GB of ram usage. No issues. I rolled back to 6.12.11 now and trying a parity check. edit: [ 494.598353] ------------[ cut here ]------------ [ 494.598363] kernel BUG at drivers/md/unraid.c:1617! [ 494.598915] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 494.599427] CPU: 8 PID: 14798 Comm: unraidd0 Tainted: P O 6.1.99-Unraid #1 [ 494.599917] Hardware name: Supermicro Super Server/X13SAE-F, BIOS 3.3b 08/26/2024 [ 494.600408] RIP: 0010:unraidd+0x1051/0x1140 [md_mod] [ 494.600905] Code: 00 83 3d 83 50 00 00 03 7e 16 41 8b 56 98 89 e9 48 c7 c7 21 93 71 a0 48 8b 73 20 e8 b2 07 15 e1 41 f6 86 69 ff ff ff 02 75 02 <0f> 0b 48 8b 43 20 49 03 47 10 41 c7 46 b0 00 10 00 00 49 8b 56 10 [ 494.601952] RSP: 0018:ffffc900018efdf0 EFLAGS: 00010246 [ 494.602469] RAX: 0000000000000000 RBX: ffff88814a25a8d8 RCX: 0000000000000000 [ 494.602993] RDX: 0000000000000000 RSI: ffffffff829ee720 RDI: ffff8881099a6238 [ 494.603511] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 [ 494.604026] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881531cc110 [ 494.604532] R13: ffff88814a25aad0 R14: ffff88814a25ab48 R15: ffff88814ae352d8 [ 494.605038] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 494.605545] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 494.606049] CR2: 000015235ec7269c CR3: 000000000420a004 CR4: 0000000000770ee0 [ 494.606556] PKRU: 55555554 [ 494.607052] Call Trace: [ 494.607539] <TASK> [ 494.608021] ? __die_body+0x1a/0x5c [ 494.608502] ? die+0x30/0x49 [ 494.608964] ? do_trap+0x7b/0xfe [ 494.609416] ? unraidd+0x1051/0x1140 [md_mod] [ 494.609868] ? unraidd+0x1051/0x1140 [md_mod] [ 494.610310] ? do_error_trap+0x6e/0x98 [ 494.610752] ? unraidd+0x1051/0x1140 [md_mod] [ 494.611195] ? exc_invalid_op+0x4c/0x60 [ 494.611630] ? unraidd+0x1051/0x1140 [md_mod] [ 494.612062] ? asm_exc_invalid_op+0x16/0x20 [ 494.612478] ? 
unraidd+0x1051/0x1140 [md_mod] [ 494.612888] md_thread+0xf4/0x122 [md_mod] [ 494.613292] ? _raw_spin_rq_lock_irqsave+0x20/0x20 [ 494.613694] ? signal_pending+0x1d/0x1d [md_mod] [ 494.614090] kthread+0xe4/0xef [ 494.614485] ? kthread_complete_and_exit+0x1b/0x1b [ 494.614883] ret_from_fork+0x1f/0x30 [ 494.615276] </TASK> [ 494.615655] Modules linked in: xt_nat xt_tcpudp veth xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter ipmi_devintf md_mod xfs tcp_diag inet_diag i915 drm_buddy drm_display_helper intel_gtt iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs macvtap macvlan tap af_packet 8021q garp mrp bridge stp llc bonding tls e1000e igc intel_rapl_msr intel_rapl_common iosf_mbi x86_pkg_temp_thermal intel_powerclamp zfs(PO) coretemp kvm_intel zunicode(PO) zzstd(O) kvm ast drm_vram_helper i2c_algo_bit drm_ttm_helper zlua(O) ttm zavl(PO) drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel icp(PO) sha512_ssse3 sha256_ssse3 sha1_ssse3 drm aesni_intel crypto_simd zcommon(PO) znvpair(PO) agpgart i2c_i801 mei_hdcp [ 494.615685] mei_pxp cryptd rapl spl(O) wmi_bmof ipmi_ssif intel_cstate mpt3sas intel_uncore syscopyarea i2c_smbus mei_me raid_class sysfillrect mpi3mr video ahci nvme input_leds sysimgblt acpi_ipmi i2c_core joydev mei fb_sys_fops led_class thermal nvme_core libahci scsi_transport_sas fan wmi ipmi_si backlight acpi_pad acpi_tad intel_pmc_core button unix [last unloaded: e1000e] [ 494.621395] ---[ end trace 0000000000000000 ]--- [ 495.257484] RIP: 0010:unraidd+0x1051/0x1140 [md_mod] [ 495.258185] Code: 00 83 3d 83 50 00 00 03 7e 16 41 8b 56 98 89 e9 48 c7 c7 21 93 71 a0 48 8b 73 20 e8 b2 07 15 e1 41 f6 86 69 ff ff ff 02 75 02 <0f> 0b 48 8b 43 20 49 03 47 10 41 c7 46 b0 
00 10 00 00 49 8b 56 10 [ 495.259585] RSP: 0018:ffffc900018efdf0 EFLAGS: 00010246 [ 495.260268] RAX: 0000000000000000 RBX: ffff88814a25a8d8 RCX: 0000000000000000 [ 495.260909] RDX: 0000000000000000 RSI: ffffffff829ee720 RDI: ffff8881099a6238 [ 495.261562] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 [ 495.262201] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881531cc110 [ 495.262842] R13: ffff88814a25aad0 R14: ffff88814a25ab48 R15: ffff88814ae352d8 [ 495.263483] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 495.264126] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 495.264782] CR2: 000015235ec7269c CR3: 000000000420a004 CR4: 0000000000770ee0 [ 495.265440] PKRU: 55555554 [ 495.266076] ------------[ cut here ]------------ [ 495.266695] WARNING: CPU: 8 PID: 14798 at kernel/exit.c:816 do_exit+0x87/0x923 [ 495.267349] Modules linked in: xt_nat xt_tcpudp veth xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter ipmi_devintf md_mod xfs tcp_diag inet_diag i915 drm_buddy drm_display_helper intel_gtt iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs macvtap macvlan tap af_packet 8021q garp mrp bridge stp llc bonding tls e1000e igc intel_rapl_msr intel_rapl_common iosf_mbi x86_pkg_temp_thermal intel_powerclamp zfs(PO) coretemp kvm_intel zunicode(PO) zzstd(O) kvm ast drm_vram_helper i2c_algo_bit drm_ttm_helper zlua(O) ttm zavl(PO) drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel icp(PO) sha512_ssse3 sha256_ssse3 sha1_ssse3 drm aesni_intel crypto_simd zcommon(PO) znvpair(PO) agpgart i2c_i801 mei_hdcp [ 495.267379] mei_pxp cryptd rapl spl(O) wmi_bmof ipmi_ssif intel_cstate mpt3sas intel_uncore syscopyarea i2c_smbus mei_me 
raid_class sysfillrect mpi3mr video ahci nvme input_leds sysimgblt acpi_ipmi i2c_core joydev mei fb_sys_fops led_class thermal nvme_core libahci scsi_transport_sas fan wmi ipmi_si backlight acpi_pad acpi_tad intel_pmc_core button unix [last unloaded: e1000e] [ 495.276096] CPU: 8 PID: 14798 Comm: unraidd0 Tainted: P D O 6.1.99-Unraid #1 [ 495.277044] Hardware name: Supermicro Super Server/X13SAE-F, BIOS 3.3b 08/26/2024 [ 495.277992] RIP: 0010:do_exit+0x87/0x923 [ 495.278948] Code: 24 74 04 75 13 b8 01 00 00 00 41 89 6c 24 60 48 c1 e0 22 49 89 44 24 70 4c 89 ef e8 d1 2b 81 00 48 83 bb b0 07 00 00 00 74 02 <0f> 0b 48 8b bb d8 06 00 00 e8 d3 2a 81 00 48 8b 83 d0 06 00 00 83 [ 495.280879] RSP: 0018:ffffc900018efee0 EFLAGS: 00010286 [ 495.281871] RAX: 0000000000000000 RBX: ffff88812cfe7000 RCX: 0000000000000000 [ 495.282805] RDX: 0000000000000001 RSI: 0000000000002710 RDI: 00000000ffffffff [ 495.283794] RBP: 000000000000000b R08: 0000000000000000 R09: ffffc90002ab2020 [ 495.284745] R10: 0000000000aaaaaa R11: 0000000000000001 R12: ffff8881343b0800 [ 495.285706] R13: ffff88812f006b40 R14: 0000000000000002 R15: ffffffff820b2ea5 [ 495.286659] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 495.287592] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 495.288518] CR2: 000015235ec7269c CR3: 000000000420a004 CR4: 0000000000770ee0 [ 495.289413] PKRU: 55555554 [ 495.290287] Call Trace: [ 495.291115] <TASK> [ 495.291904] ? __warn+0xab/0x122 [ 495.292687] ? report_bug+0x109/0x17e [ 495.293472] ? do_exit+0x87/0x923 [ 495.294241] ? handle_bug+0x41/0x6f [ 495.294986] ? exc_invalid_op+0x13/0x60 [ 495.295742] ? asm_exc_invalid_op+0x16/0x20 [ 495.296446] ? do_exit+0x87/0x923 [ 495.297145] make_task_dead+0x11c/0x11c [ 495.297851] rewind_stack_and_make_dead+0x17/0x17 [ 495.298554] RIP: 0000:0x0 [ 495.299261] Code: Unable to access opcode bytes at 0xffffffffffffffd6. 
[ 495.299950] RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000 [ 495.300670] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 495.301385] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 495.302093] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 495.302792] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 495.303462] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 495.304135] </TASK> [ 495.304791] ---[ end trace 0000000000000000 ]--- Same on that version too.
-
Kernel issues, Bugs can't run parity
I haven't rolled back that far yet. I actually upgraded to 7.0.0 as part of the testing and am running on that atm. When stress-ng finishes, assuming it doesn't show a CPU issue, I'll try to go further back and run a parity check. To hedge my bets, I've opened an RMA case with Intel in parallel. edit: the first round of stress-ng had no issues. I will do 4-5x more, and if those pass, I'm not sure this is the CPU, unless there's something else I can do to stress it more?
-
Kernel issues, Bugs can't run parity
Okay, so for the CPU, I'm currently running:

docker run -t --rm polinux/stress-ng --cpu 20 --cpu-method fft --timeout 30m

From reading Reddit, that seems to be the way to force a crash. Do you think that's enough to prove it out?

For memory, I'm using Kingston ECC memory, and my understanding is that if it were a problem I would be getting errors in syslog about memory corrections. I didn't see anything. Is the fact that I'm using ECC that is not overclocked and comes from a kit enough to eliminate that? I've got 128GB, so a memtest86 run will take a long time.

Could this be caused by an LSI controller? I have not seen any issues with reads or writes (no SMART errors, etc.) on a daily basis, other than parity just stopping, but that seems to be related to this error. The parity check dies really early on, then drops to 256kb/s in the UI with zero activity on the array, which is odd, and only a hard reboot gets the machine back.

I do occasionally see this:

[20606.488498] traps: .NET ThreadPool[2356421] general protection fault ip:149489986585 sp:1493edb36b08 error:0 in ld-musl-x86_64.so.1[149489959000+57000]
[23215.145055] .NET ThreadPool[2683605]: segfault at 1522ad06bc60 ip 00001522ad06bc60 sp 00001522abfa8078 error 15 likely on CPU 8 (core 16, socket 0)
[23215.145061] Code: 00 00 48 df 06 ad 22 15 00 00 98 e8 08 ad 22 15 00 00 20 bc 06 ad 22 15 00 00 00 00 00 00 00 00 00 00 e8 e8 08 ad 22 15 00 00 <c8> bb 06 ad 22 15 00 00 88 77 02 ad 22 15 00 00 90 b9 06 ad 22 15

but I believe this is Radarr.
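On the ECC question, one way to check beyond grepping syslog is to read the kernel's EDAC counters directly. This is only a minimal sketch: it assumes an EDAC driver for the platform is loaded and exposes the usual `ce_count` (corrected) and `ue_count` (uncorrected) files under `/sys/devices/system/edac/mc`. The function name `read_edac_counts` is made up for illustration, and an empty result only means no EDAC controller was found, not that the memory is fine.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int|None, "ue_count": int|None}}.

    Hypothetical helper: reads the per-memory-controller corrected (ce) and
    uncorrected (ue) error counters that the Linux EDAC subsystem exposes.
    Returns an empty dict if the EDAC directory does not exist.
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts
    for mc in sorted(base.glob("mc*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                # Counter missing or unreadable on this controller.
                entry[name] = None
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    for controller, entry in read_edac_counts().items():
        print(controller, entry)
```

If the counters stay at zero across a failed parity check, that would be one more data point against memory, though it would not rule it out on its own.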
-
Kernel issues, Bugs can't run parity
No luck, same issue: [ 272.118236] ------------[ cut here ]------------ [ 272.118238] kernel BUG at drivers/md/unraid.c:1617! [ 272.118643] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 272.119013] CPU: 8 PID: 11802 Comm: unraidd0 Tainted: P O 6.6.68-Unraid #1 [ 272.119392] Hardware name: Supermicro Super Server/X13SAE-F, BIOS 3.3b 08/26/2024 [ 272.119775] RIP: 0010:unraidd+0x1189/0x1278 [md_mod] [ 272.120153] Code: 00 83 3d 01 80 fb ff 03 7e 16 41 8b 56 98 89 e9 48 c7 c7 21 b3 ac a0 48 8b 73 20 e8 7b c4 5c e0 41 f6 86 69 ff ff ff 02 75 02 <0f> 0b 48 8b 43 20 49 03 47 18 41 c7 46 b0 00 10 00 00 49 8b 56 10 [ 272.120925] RSP: 0018:ffffc900014cbda8 EFLAGS: 00010246 [ 272.121300] RAX: 0000000000000000 RBX: ffff88816c655f38 RCX: 0000000000000000 [ 272.121680] RDX: 0000000000000000 RSI: ffffffff82cb9420 RDI: ffff888108682438 [ 272.122060] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 [ 272.122439] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88815653c928 [ 272.122813] R13: ffff88816c656340 R14: ffff88816c6563b8 R15: ffff888144431540 [ 272.123184] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 272.123560] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 272.123923] CR2: 000015112a864fd8 CR3: 0000000005416000 CR4: 0000000000750ee0 [ 272.124289] PKRU: 55555554 [ 272.124650] Call Trace: [ 272.124997] <TASK> [ 272.125344] ? __die_body+0x1a/0x5c [ 272.125693] ? die+0x30/0x49 [ 272.126027] ? do_trap+0x7b/0xfe [ 272.126357] ? unraidd+0x1189/0x1278 [md_mod] [ 272.126684] ? unraidd+0x1189/0x1278 [md_mod] [ 272.126995] ? do_error_trap+0x6e/0x98 [ 272.127308] ? unraidd+0x1189/0x1278 [md_mod] [ 272.127624] ? exc_invalid_op+0x4c/0x60 [ 272.127937] ? unraidd+0x1189/0x1278 [md_mod] [ 272.128247] ? asm_exc_invalid_op+0x16/0x20 [ 272.128548] ? unraidd+0x1189/0x1278 [md_mod] [ 272.128847] ? unraidd+0x1159/0x1278 [md_mod] [ 272.129137] ? preempt_latency_start+0x2b/0x46 [ 272.129420] ? 
preempt_latency_start+0x2b/0x46 [ 272.129693] md_thread+0xf7/0x127 [md_mod] [ 272.129968] ? __pfx_autoremove_wake_function+0x10/0x10 [ 272.130246] ? __pfx_md_thread+0x10/0x10 [md_mod] [ 272.130526] kthread+0xf1/0xfc [ 272.130801] ? __pfx_kthread+0x10/0x10 [ 272.131063] ret_from_fork+0x21/0x36 [ 272.131329] ? __pfx_kthread+0x10/0x10 [ 272.131594] ret_from_fork_asm+0x1b/0x30 [ 272.131860] </TASK> [ 272.132117] Modules linked in: ipmi_devintf md_mod i915 drm_buddy ttm drm_display_helper intel_gtt agpgart iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs macvtap macvlan tap af_packet 8021q garp mrp bridge stp llc bonding tls igc e1000e intel_rapl_common iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel ast crypto_simd drm_shmem_helper cryptd i2c_algo_bit drm_kms_helper mei_pxp mei_hdcp ipmi_ssif zfs(PO) i2c_i801 rapl intel_cstate acpi_ipmi mei_me i2c_smbus drm spl(O) video ahci input_leds wmi_bmof intel_uncore led_class i2c_core mei ipmi_si libahci joydev backlight wmi thermal acpi_pad fan acpi_tad nvme mpt3sas mpi3mr nvme_core raid_class button [ 272.132154] scsi_transport_sas [last unloaded: igc] [ 272.135364] ---[ end trace 0000000000000000 ]--- [ 272.778589] pstore: backend (erst) writing error (-28) [ 272.778977] RIP: 0010:unraidd+0x1189/0x1278 [md_mod] [ 272.779356] Code: 00 83 3d 01 80 fb ff 03 7e 16 41 8b 56 98 89 e9 48 c7 c7 21 b3 ac a0 48 8b 73 20 e8 7b c4 5c e0 41 f6 86 69 ff ff ff 02 75 02 <0f> 0b 48 8b 43 20 49 03 47 18 41 c7 46 b0 00 10 00 00 49 8b 56 10 [ 272.780149] RSP: 0018:ffffc900014cbda8 EFLAGS: 00010246 [ 272.780524] RAX: 0000000000000000 RBX: ffff88816c655f38 RCX: 0000000000000000 [ 
272.780913] RDX: 0000000000000000 RSI: ffffffff82cb9420 RDI: ffff888108682438 [ 272.781321] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 [ 272.781702] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88815653c928 [ 272.782080] R13: ffff88816c656340 R14: ffff88816c6563b8 R15: ffff888144431540 [ 272.782450] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 272.782826] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 272.783208] CR2: 000015112a864fd8 CR3: 0000000005416000 CR4: 0000000000750ee0 [ 272.783589] PKRU: 55555554 [ 272.783969] ------------[ cut here ]------------ [ 272.784350] WARNING: CPU: 8 PID: 11802 at kernel/exit.c:820 do_exit+0x81/0x90b [ 272.784747] Modules linked in: ipmi_devintf md_mod i915 drm_buddy ttm drm_display_helper intel_gtt agpgart iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs macvtap macvlan tap af_packet 8021q garp mrp bridge stp llc bonding tls igc e1000e intel_rapl_common iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel ast crypto_simd drm_shmem_helper cryptd i2c_algo_bit drm_kms_helper mei_pxp mei_hdcp ipmi_ssif zfs(PO) i2c_i801 rapl intel_cstate acpi_ipmi mei_me i2c_smbus drm spl(O) video ahci input_leds wmi_bmof intel_uncore led_class i2c_core mei ipmi_si libahci joydev backlight wmi thermal acpi_pad fan acpi_tad nvme mpt3sas mpi3mr nvme_core raid_class button [ 272.784778] scsi_transport_sas [last unloaded: igc] [ 272.789193] CPU: 8 PID: 11802 Comm: unraidd0 Tainted: P D O 6.6.68-Unraid #1 [ 272.789729] Hardware name: Supermicro Super Server/X13SAE-F, BIOS 3.3b 08/26/2024 [ 272.790271] RIP: 0010:do_exit+0x81/0x90b [ 
272.790814] Code: 24 74 04 75 13 b8 01 00 00 00 41 89 6c 24 60 48 c1 e0 22 49 89 44 24 70 4c 89 ef e8 a2 e9 9b 00 48 83 bb c0 07 00 00 00 74 02 <0f> 0b 48 8b bb d8 06 00 00 e8 6c e8 9b 00 48 8b 83 d0 06 00 00 83 [ 272.791955] RSP: 0018:ffffc900014cbee0 EFLAGS: 00010286 [ 272.792499] RAX: 0000000000000000 RBX: ffff8881090f7000 RCX: 0000000000000000 [ 272.793060] RDX: 0000000000000001 RSI: 0000000000002710 RDI: 00000000ffffffff [ 272.793614] RBP: 000000000000000b R08: 0000000000000000 R09: ffff888146ed7800 [ 272.794171] R10: 0000000000000001 R11: ffffc90002a00020 R12: ffff888106024400 [ 272.794721] R13: ffff888108e70000 R14: 0000000000000002 R15: ffffffff8222702e [ 272.795292] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 272.795862] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 272.796436] CR2: 000015112a864fd8 CR3: 0000000005416000 CR4: 0000000000750ee0 [ 272.796990] PKRU: 55555554 [ 272.797518] Call Trace: [ 272.798020] <TASK> [ 272.798511] ? __warn+0x99/0x11a [ 272.798996] ? report_bug+0xd9/0x153 [ 272.799459] ? do_exit+0x81/0x90b [ 272.799910] ? handle_bug+0x53/0x7c [ 272.800358] ? exc_invalid_op+0x13/0x60 [ 272.800796] ? asm_exc_invalid_op+0x16/0x20 [ 272.801241] ? do_exit+0x81/0x90b [ 272.801673] ? __pfx_md_thread+0x10/0x10 [md_mod] [ 272.802107] ? kthread+0xf1/0xfc [ 272.802521] make_task_dead+0x113/0x113 [ 272.802935] rewind_stack_and_make_dead+0x17/0x17 [ 272.803347] RIP: 0000:0x0 [ 272.803757] Code: Unable to access opcode bytes at 0xffffffffffffffd6. 
[ 272.804181] RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000 [ 272.804621] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 272.805061] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 272.805480] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 272.805890] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 272.806306] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 272.806701] </TASK> [ 272.807098] ---[ end trace 0000000000000000 ]--- Seem to be able to read/write fine, but parity check throws this kernel issue. I can't shutdown if i hit this.
-
Kernel issues, Bugs can't run parity
[ 546.817281] kernel BUG at drivers/md/unraid.c:1617! [ 546.817796] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 546.818279] CPU: 8 PID: 15431 Comm: unraidd0 Tainted: P O 6.1.106-Unraid #1 [ 546.818772] Hardware name: Supermicro Super Server/X13SAE-F, BIOS 3.3b 08/26/2024 [ 546.819262] RIP: 0010:unraidd+0x1051/0x1140 [md_mod] [ 546.819759] Code: 00 83 3d 83 50 00 00 03 7e 16 41 8b 56 98 89 e9 48 c7 c7 21 83 65 a0 48 8b 73 20 e8 82 1e 21 e1 41 f6 86 69 ff ff ff 02 75 02 <0f> 0b 48 8b 43 20 49 03 47 10 41 c7 46 b0 00 10 00 00 49 8b 56 10 [ 546.820792] RSP: 0018:ffffc900018a7df0 EFLAGS: 00010246 [ 546.821305] RAX: 0000000000000000 RBX: ffff888171e428d8 RCX: 0000000000000000 [ 546.821826] RDX: 0000000000000000 RSI: ffffffff829f0720 RDI: ffff8881312a9a38 [ 546.822343] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000 [ 546.822858] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888141cda120 [ 546.823367] R13: ffff888171e42c30 R14: ffff888171e42ca8 R15: ffff88813e25d458 [ 546.823868] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 546.824375] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 546.824901] CR2: 0000146db4f7a840 CR3: 000000000420a005 CR4: 0000000000770ee0 [ 546.825422] PKRU: 55555554 [ 546.825936] Call Trace: [ 546.826423] <TASK> [ 546.826900] ? __die_body+0x1a/0x5c [ 546.827386] ? die+0x30/0x49 [ 546.827847] ? do_trap+0x7b/0xfe [ 546.828302] ? unraidd+0x1051/0x1140 [md_mod] [ 546.828752] ? unraidd+0x1051/0x1140 [md_mod] [ 546.829194] ? do_error_trap+0x6e/0x98 [ 546.829647] ? unraidd+0x1051/0x1140 [md_mod] [ 546.830114] ? exc_invalid_op+0x4c/0x60 [ 546.830568] ? unraidd+0x1051/0x1140 [md_mod] [ 546.831019] ? asm_exc_invalid_op+0x16/0x20 [ 546.831451] ? unraidd+0x1051/0x1140 [md_mod] [ 546.831861] md_thread+0xf4/0x122 [md_mod] [ 546.832263] ? _raw_spin_rq_lock_irqsave+0x20/0x20 [ 546.832664] ? signal_pending+0x1d/0x1d [md_mod] [ 546.833063] kthread+0xe4/0xef [ 546.833453] ? 
kthread_complete_and_exit+0x1b/0x1b [ 546.833849] ret_from_fork+0x1f/0x30 [ 546.834240] </TASK> [ 546.834620] Modules linked in: veth ipvlan xt_nat xt_tcpudp xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter ipmi_devintf md_mod xfs tcp_diag inet_diag i915 drm_buddy drm_display_helper intel_gtt iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs macvtap macvlan tap af_packet 8021q garp mrp bridge stp llc bonding tls igc e1000e zfs(PO) intel_rapl_msr intel_rapl_common iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm zunicode(PO) ast zzstd(O) drm_vram_helper i2c_algo_bit drm_ttm_helper ttm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel drm_kms_helper sha512_ssse3 zlua(O) sha256_ssse3 sha1_ssse3 zavl(PO) aesni_intel mei_hdcp crypto_simd cryptd icp(PO) mei_pxp drm zcommon(PO) rapl [ 546.834652] znvpair(PO) spl(O) ipmi_ssif intel_cstate mei_me agpgart i2c_i801 wmi_bmof mpt3sas syscopyarea mpi3mr sysfillrect i2c_smbus input_leds sysimgblt nvme ahci raid_class intel_uncore joydev led_class acpi_ipmi fb_sys_fops i2c_core mei nvme_core scsi_transport_sas libahci thermal fan video wmi ipmi_si backlight intel_pmc_core acpi_tad acpi_pad button unix [last unloaded: igc] [ 546.840354] ---[ end trace 0000000000000000 ]--- [ 547.483269] RIP: 0010:unraidd+0x1051/0x1140 [md_mod] [ 547.483904] Code: 00 83 3d 83 50 00 00 03 7e 16 41 8b 56 98 89 e9 48 c7 c7 21 83 65 a0 48 8b 73 20 e8 82 1e 21 e1 41 f6 86 69 ff ff ff 02 75 02 <0f> 0b 48 8b 43 20 49 03 47 10 41 c7 46 b0 00 10 00 00 49 8b 56 10 [ 547.485209] RSP: 0018:ffffc900018a7df0 EFLAGS: 00010246 [ 547.485837] RAX: 0000000000000000 RBX: ffff888171e428d8 RCX: 0000000000000000 [ 547.486459] RDX: 0000000000000000 RSI: ffffffff829f0720 
RDI: ffff8881312a9a38 [ 547.487075] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000 [ 547.487694] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888141cda120 [ 547.488311] R13: ffff888171e42c30 R14: ffff888171e42ca8 R15: ffff88813e25d458 [ 547.488924] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 547.489553] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 547.490174] CR2: 0000146db4f7a840 CR3: 000000000420a005 CR4: 0000000000770ee0 [ 547.490804] PKRU: 55555554 [ 547.491418] ------------[ cut here ]------------ [ 547.492026] WARNING: CPU: 8 PID: 15431 at kernel/exit.c:816 do_exit+0x87/0x923 [ 547.492639] Modules linked in: veth ipvlan xt_nat xt_tcpudp xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter ipmi_devintf md_mod xfs tcp_diag inet_diag i915 drm_buddy drm_display_helper intel_gtt iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs macvtap macvlan tap af_packet 8021q garp mrp bridge stp llc bonding tls igc e1000e zfs(PO) intel_rapl_msr intel_rapl_common iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm zunicode(PO) ast zzstd(O) drm_vram_helper i2c_algo_bit drm_ttm_helper ttm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel drm_kms_helper sha512_ssse3 zlua(O) sha256_ssse3 sha1_ssse3 zavl(PO) aesni_intel mei_hdcp crypto_simd cryptd icp(PO) mei_pxp drm zcommon(PO) rapl [ 547.492670] znvpair(PO) spl(O) ipmi_ssif intel_cstate mei_me agpgart i2c_i801 wmi_bmof mpt3sas syscopyarea mpi3mr sysfillrect i2c_smbus input_leds sysimgblt nvme ahci raid_class intel_uncore joydev led_class acpi_ipmi fb_sys_fops i2c_core mei nvme_core scsi_transport_sas libahci thermal fan video wmi ipmi_si backlight intel_pmc_core acpi_tad 
acpi_pad button unix [last unloaded: igc] [ 547.501188] CPU: 8 PID: 15431 Comm: unraidd0 Tainted: P D O 6.1.106-Unraid #1 [ 547.502085] Hardware name: Supermicro Super Server/X13SAE-F, BIOS 3.3b 08/26/2024 [ 547.502981] RIP: 0010:do_exit+0x87/0x923 [ 547.503882] Code: 24 74 04 75 13 b8 01 00 00 00 41 89 6c 24 60 48 c1 e0 22 49 89 44 24 70 4c 89 ef e8 41 30 81 00 48 83 bb b0 07 00 00 00 74 02 <0f> 0b 48 8b bb d8 06 00 00 e8 43 2f 81 00 48 8b 83 d0 06 00 00 83 [ 547.505729] RSP: 0018:ffffc900018a7ee0 EFLAGS: 00010286 [ 547.506650] RAX: 0000000000000000 RBX: ffff88813cf7b000 RCX: 0000000000000000 [ 547.507578] RDX: 0000000000000001 RSI: 0000000000002710 RDI: 00000000ffffffff [ 547.508505] RBP: 000000000000000b R08: 0000000000000000 R09: ffffc900016bc020 [ 547.509421] R10: 0000000000aaaaaa R11: 0000000000000001 R12: ffff888130b7f400 [ 547.510325] R13: ffff888104d90840 R14: 0000000000000002 R15: ffffffff820b3185 [ 547.511223] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 547.512110] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 547.512975] CR2: 0000146db4f7a840 CR3: 000000000420a005 CR4: 0000000000770ee0 [ 547.513836] PKRU: 55555554 [ 547.514663] Call Trace: [ 547.515466] <TASK> [ 547.516250] ? __warn+0xab/0x122 [ 547.517009] ? report_bug+0x109/0x17e [ 547.517751] ? do_exit+0x87/0x923 [ 547.518473] ? handle_bug+0x41/0x6f [ 547.519166] ? exc_invalid_op+0x13/0x60 [ 547.519868] ? asm_exc_invalid_op+0x16/0x20 [ 547.520559] ? do_exit+0x87/0x923 [ 547.521222] make_task_dead+0x11c/0x11c [ 547.521879] rewind_stack_and_make_dead+0x17/0x17 [ 547.522539] RIP: 0000:0x0 [ 547.523190] Code: Unable to access opcode bytes at 0xffffffffffffffd6. 
[ 547.523861] RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000 [ 547.524549] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 547.525219] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 547.525876] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 547.526526] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 547.527159] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 547.527791] </TASK> No luck.
-
Kernel issues, Bugs can't run parity
If I'm seeing this right, I've either got a hardware failure (I really hope not) or there's something wrong with unraidd0 on kernel 6.1.118-Unraid:

[ 2310.522979] BUG: unable to handle page fault for address: 0000000000001358
[ 2310.523471] #PF: supervisor read access in kernel mode
[ 2310.523958] #PF: error_code(0x0000) - not-present page
[ 2310.524445] PGD 0 P4D 0
[ 2310.524920] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 2310.525396] CPU: 8 PID: 15516 Comm: unraidd0 Tainted: P O 6.1.118-Unraid #1

I'm going to try rolling back to 6.12.13 with 6.1.106-Unraid and see if I can get a valid parity check.
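Since the same trace keeps coming back on different kernels and BIOS versions, it can help to boil each dmesg dump down to a few fields before comparing them. A rough sketch, assuming the log format seen in this thread; the regexes and the `summarize_oops` name are my own and are not part of any Unraid tooling.

```python
import re

# Patterns assume the dmesg layout pasted in this thread.
FAULT_RE = re.compile(
    r"(kernel BUG at \S+|BUG: unable to handle page fault for address: [0-9a-f]+)"
)
CPU_RE = re.compile(
    r"CPU: (\d+) PID: (\d+) Comm: (\S+) Tainted: .*? (\d+\.\d+\.\d+\S*) #\d+"
)
BIOS_RE = re.compile(r"Hardware name: (.+?), BIOS (\S+) (\d{2}/\d{2}/\d{4})")


def summarize_oops(text):
    """Extract the first fault line, task, kernel and BIOS from dmesg text."""
    summary = {}
    m = FAULT_RE.search(text)
    if m:
        summary["fault"] = m.group(1)
    m = CPU_RE.search(text)
    if m:
        summary["cpu"] = int(m.group(1))
        summary["pid"] = int(m.group(2))
        summary["comm"] = m.group(3)
        summary["kernel"] = m.group(4)
    m = BIOS_RE.search(text)
    if m:
        summary["board"] = m.group(1)
        summary["bios"] = m.group(2)
        summary["bios_date"] = m.group(3)
    return summary
```

Running it over each saved syslog gives one small record per crash, which makes it easier to see that the faulting task and CPU stay the same while the kernel version changes.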
-
Kernel issues, Bugs can't run parity
Any insight? [ 104.608783] IPv6: ADDRCONF(NETDEV_CHANGE): veth17ee8b2: link becomes ready [ 2310.522979] BUG: unable to handle page fault for address: 0000000000001358 [ 2310.523471] #PF: supervisor read access in kernel mode [ 2310.523958] #PF: error_code(0x0000) - not-present page [ 2310.524445] PGD 0 P4D 0 [ 2310.524920] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 2310.525396] CPU: 8 PID: 15516 Comm: unraidd0 Tainted: P O 6.1.118-Unraid #1 [ 2310.525884] Hardware name: Supermicro Super Server/X13SAE-F, BIOS 3.3b 08/26/2024 [ 2310.526375] RIP: 0010:bio_associate_blkg_from_css+0x166/0x18b [ 2310.526875] Code: 7f 30 eb e9 e8 bb d2 cc ff eb 2d 48 8b 45 08 48 8b 80 58 03 00 00 48 8b b8 88 01 00 00 48 83 c7 38 e8 f3 f4 ff ff 48 8b 45 08 <48> 8b 80 58 03 00 00 4c 8b b8 88 01 00 00 4c 89 7d 48 48 83 c4 20 [ 2310.527921] RSP: 0018:ffffc90005bf7d68 EFLAGS: 00010202 [ 2310.528434] RAX: 0000000000001000 RBX: ffffffff829f1720 RCX: 0000000000000000 [ 2310.528952] RDX: ffff88813c68e000 RSI: ffffffff829f1720 RDI: ffff8881096eb038 [ 2310.529463] RBP: ffff888151f8e810 R08: 0000000000000000 R09: 0000000000000000 [ 2310.529971] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 [ 2310.530470] R13: ffff888151f8e810 R14: ffff888151f8e888 R15: ffff88813e821a58 [ 2310.530967] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 2310.531472] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2310.531973] CR2: 0000000000001358 CR3: 000000000420a000 CR4: 0000000000750ee0 [ 2310.532476] PKRU: 55555554 [ 2310.532969] Call Trace: [ 2310.533455] <TASK> [ 2310.533928] ? __die_body+0x1a/0x5c [ 2310.534402] ? page_fault_oops+0x329/0x376 [ 2310.534872] ? do_user_addr_fault+0x12e/0x465 [ 2310.535335] ? exc_page_fault+0xfb/0x11d [ 2310.535802] ? asm_exc_page_fault+0x22/0x30 [ 2310.536258] ? bio_associate_blkg_from_css+0x166/0x18b [ 2310.536715] ? bio_associate_blkg_from_css+0x162/0x18b [ 2310.537153] ? 
submit_bio_noacct_nocheck+0x134/0x269 [ 2310.537592] bio_associate_blkg+0x2f/0x35 [ 2310.538018] bio_init+0x59/0x92 [ 2310.538435] unraidd+0xfe0/0x1140 [md_mod] [ 2310.538854] md_thread+0xf4/0x122 [md_mod] [ 2310.539269] ? _raw_spin_rq_lock_irqsave+0x20/0x20 [ 2310.539684] ? signal_pending+0x1d/0x1d [md_mod] [ 2310.540095] kthread+0xe4/0xef [ 2310.540499] ? kthread_complete_and_exit+0x1b/0x1b [ 2310.540906] ret_from_fork+0x1f/0x30 [ 2310.541314] </TASK> [ 2310.541710] Modules linked in: ipvlan veth xt_nat xt_tcpudp xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter ipmi_devintf md_mod xfs tcp_diag inet_diag i915 drm_buddy drm_display_helper intel_gtt iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs macvtap macvlan tap af_packet 8021q garp mrp bridge stp llc bonding tls igc e1000e intel_rapl_msr intel_rapl_common iosf_mbi x86_pkg_temp_thermal intel_powerclamp zfs(PO) coretemp kvm_intel zunicode(PO) kvm zzstd(O) zlua(O) ast crct10dif_pclmul drm_vram_helper crc32_pclmul crc32c_intel i2c_algo_bit zavl(PO) ghash_clmulni_intel drm_ttm_helper sha512_ssse3 ttm sha256_ssse3 sha1_ssse3 icp(PO) aesni_intel drm_kms_helper crypto_simd cryptd drm zcommon(PO) rapl mei_hdcp mei_pxp [ 2310.541741] znvpair(PO) agpgart i2c_i801 ipmi_ssif intel_cstate syscopyarea spl(O) mpt3sas nvme sysfillrect mei_me i2c_smbus input_leds ahci wmi_bmof sysimgblt intel_uncore mpi3mr acpi_ipmi raid_class video joydev fb_sys_fops i2c_core led_class libahci nvme_core mei thermal scsi_transport_sas fan wmi backlight ipmi_si intel_pmc_core acpi_tad acpi_pad button unix [last unloaded: igc] [ 2310.547653] CR2: 0000000000001358 [ 2310.549533] ---[ end trace 0000000000000000 ]--- [ 2311.182237] RIP: 0010:bio_associate_blkg_from_css+0x166/0x18b 
[ 2311.182893] Code: 7f 30 eb e9 e8 bb d2 cc ff eb 2d 48 8b 45 08 48 8b 80 58 03 00 00 48 8b b8 88 01 00 00 48 83 c7 38 e8 f3 f4 ff ff 48 8b 45 08 <48> 8b 80 58 03 00 00 4c 8b b8 88 01 00 00 4c 89 7d 48 48 83 c4 20 [ 2311.184209] RSP: 0018:ffffc90005bf7d68 EFLAGS: 00010202 [ 2311.184847] RAX: 0000000000001000 RBX: ffffffff829f1720 RCX: 0000000000000000 [ 2311.185487] RDX: ffff88813c68e000 RSI: ffffffff829f1720 RDI: ffff8881096eb038 [ 2311.186122] RBP: ffff888151f8e810 R08: 0000000000000000 R09: 0000000000000000 [ 2311.186759] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 [ 2311.187392] R13: ffff888151f8e810 R14: ffff888151f8e888 R15: ffff88813e821a58 [ 2311.188026] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 2311.188675] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2311.189313] CR2: 0000000000001358 CR3: 000000000420a000 CR4: 0000000000750ee0 [ 2311.189946] PKRU: 55555554 [ 2311.190572] note: unraidd0[15516] exited with irqs disabled [ 2311.191244] ------------[ cut here ]------------ [ 2311.191892] WARNING: CPU: 8 PID: 15516 at kernel/exit.c:816 do_exit+0x87/0x923 [ 2311.192546] Modules linked in: ipvlan veth xt_nat xt_tcpudp xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter ipmi_devintf md_mod xfs tcp_diag inet_diag i915 drm_buddy drm_display_helper intel_gtt iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs macvtap macvlan tap af_packet 8021q garp mrp bridge stp llc bonding tls igc e1000e intel_rapl_msr intel_rapl_common iosf_mbi x86_pkg_temp_thermal intel_powerclamp zfs(PO) coretemp kvm_intel zunicode(PO) kvm zzstd(O) zlua(O) ast crct10dif_pclmul drm_vram_helper crc32_pclmul crc32c_intel i2c_algo_bit zavl(PO) ghash_clmulni_intel 
drm_ttm_helper sha512_ssse3 ttm sha256_ssse3 sha1_ssse3 icp(PO) aesni_intel drm_kms_helper crypto_simd cryptd drm zcommon(PO) rapl mei_hdcp mei_pxp [ 2311.192575] znvpair(PO) agpgart i2c_i801 ipmi_ssif intel_cstate syscopyarea spl(O) mpt3sas nvme sysfillrect mei_me i2c_smbus input_leds ahci wmi_bmof sysimgblt intel_uncore mpi3mr acpi_ipmi raid_class video joydev fb_sys_fops i2c_core led_class libahci nvme_core mei thermal scsi_transport_sas fan wmi backlight ipmi_si intel_pmc_core acpi_tad acpi_pad button unix [last unloaded: igc] [ 2311.201228] CPU: 8 PID: 15516 Comm: unraidd0 Tainted: P D O 6.1.118-Unraid #1 [ 2311.202132] Hardware name: Supermicro Super Server/X13SAE-F, BIOS 3.3b 08/26/2024 [ 2311.203041] RIP: 0010:do_exit+0x87/0x923 [ 2311.203948] Code: 24 74 04 75 13 b8 01 00 00 00 41 89 6c 24 60 48 c1 e0 22 49 89 44 24 70 4c 89 ef e8 1f 47 81 00 48 83 bb b0 07 00 00 00 74 02 <0f> 0b 48 8b bb d8 06 00 00 e8 21 46 81 00 48 8b 83 d0 06 00 00 83 [ 2311.205819] RSP: 0018:ffffc90005bf7ee0 EFLAGS: 00010286 [ 2311.206758] RAX: 0000000000000000 RBX: ffff88813c68e000 RCX: 0000000000000000 [ 2311.207709] RDX: 0000000000000001 RSI: 0000000000002710 RDI: 00000000ffffffff [ 2311.208640] RBP: 0000000000000009 R08: 0000000000000000 R09: ffffc90002c8c020 [ 2311.209547] R10: 0000000000aaaaaa R11: 0000000000000001 R12: ffff88810206b000 [ 2311.210438] R13: ffff888102067380 R14: 0000000000000000 R15: 0000000000000000 [ 2311.211308] FS: 0000000000000000(0000) GS:ffff889fff800000(0000) knlGS:0000000000000000 [ 2311.212167] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2311.213006] CR2: 0000000000001358 CR3: 000000000420a000 CR4: 0000000000750ee0 [ 2311.213833] PKRU: 55555554 [ 2311.214635] Call Trace: [ 2311.215412] <TASK> [ 2311.216168] ? __warn+0xab/0x122 [ 2311.216906] ? report_bug+0x109/0x17e [ 2311.217622] ? do_exit+0x87/0x923 [ 2311.218335] ? handle_bug+0x41/0x6f [ 2311.219042] ? exc_invalid_op+0x13/0x60 [ 2311.219736] ? asm_exc_invalid_op+0x16/0x20 [ 2311.220415] ? 
do_exit+0x87/0x923 [ 2311.221086] make_task_dead+0x11c/0x11c [ 2311.221752] rewind_stack_and_make_dead+0x17/0x17 [ 2311.222419] RIP: 0000:0x0 [ 2311.223082] Code: Unable to access opcode bytes at 0xffffffffffffffd6. [ 2311.223759] RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000 [ 2311.224441] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 2311.225115] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 2311.225778] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 2311.226428] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 2311.227059] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 2311.227693] </TASK> [ 2311.228313] ---[ end trace 0000000000000000 ]--- [11200.506017] .NET ThreadPool[18089]: segfault at 3b6303668 ip 000014e8b7bc3abb sp 000014e8b62febf0 error 6 in libclrjit.so[14e8b7aa6000+1e1000] likely on CPU 8 (core 16, socket 0) [11200.507245] Code: ff ff ff ff 41 89 46 14 e9 df fd ff ff 48 8d 7d b0 4c 89 fe 41 89 d8 48 8b 5d 80 48 89 da 4c 89 e1 e8 f9 f1 ff ff 48 8b 45 c0 <49> 89 46 10 0f 10 45 b0 41 0f 11 06 4c 89 ff 48 89 de 4c 89 e2 4c [14172.070513] elogind-daemon[2090]: New session c1 of user root.

Plex, downloads, etc. seem to work fine, but parity hangs at a low percentage and I can't kill it. I had to reboot.

beyonder-nas-diagnostics-20250206-2049.zip