JorgeB
Moderators · Posts: 67,125 · Days Won: 703

Everything posted by JorgeB

  1. If anyone else wants to try: ipmitool raw 0x30 0x70 0x66 0x01 0x00 0x64
     00 - Get value
     01 - Set value
     00 - FAN 1/2/3/4
     01 - FAN A (possibly different groupings on some models)
     00 to 64 - Speed (64 is max)
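As a sketch of how those raw bytes fit together, the following builds the `ipmitool raw` argument list for the set-speed case described above. `build_fan_command` is a hypothetical helper for illustration, not part of ipmitool, and the byte layout is assumed from the post:

```python
# Sketch of the Supermicro X10/X11 raw fan-speed command described above.
# Assumed layout: ipmitool raw 0x30 0x70 0x66 <get/set> <zone> [<speed>]
# Zone 0x00 covers FAN1-4, zone 0x01 covers FANA; speed is 0x00-0x64 (0-100).
# build_fan_command is a hypothetical helper, shown only to document the bytes.

def build_fan_command(zone: int, speed_pct: int) -> list[str]:
    """Return the argv list for `ipmitool raw` that sets a fan zone's duty cycle."""
    if zone not in (0x00, 0x01):
        raise ValueError("zone must be 0x00 (FAN1-4) or 0x01 (FANA)")
    if not 0 <= speed_pct <= 100:
        raise ValueError("speed must be 0-100 (0x00-0x64)")
    return ["ipmitool", "raw", "0x30", "0x70", "0x66",
            "0x01",                  # 0x01 = set (0x00 would read the current value)
            f"0x{zone:02x}",         # fan zone
            f"0x{speed_pct:02x}"]    # duty cycle, max 0x64

print(build_fan_command(0x00, 100))
# → ['ipmitool', 'raw', '0x30', '0x70', '0x66', '0x01', '0x00', '0x64']
```

Run the resulting command at your own risk; as noted above, zone groupings may differ between board models.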
  2. Nice find, also works on my X11, apparently it works on (all?) X10 and X11.
  3. AFAIK you can ask for two 15-day extensions and they are automatic.
  4. There shouldn't be any difference. I tested mine and can easily do 500+MB/s with a single device (or 450+MB/s each with up to 4 devices), with Unraid on a test server doing a parity check (or a read check, i.e. all devices assigned as data devices, without parity).
  5. Onboard Intel ports are some of the best ports you can use in terms of stability, but they are not faster than HBAs; in fact they can be considerably slower when all ports are in use, for now only with SSDs. Everything below Skylake is limited to about 1500/1600 MB/s total for all ports combined, and Skylake and above are inexplicably limited to about 2000 MB/s, when in theory with DMI 3.0 they should do above 3000 MB/s. I was able to get 3500+MB/s total from an LSI HBA, and that wasn't its limit, since that was the max speed of the SSDs I used.
  6. Since we're missing the pre-reboot diags we can't see what happened, but SMART for parity looks fine. There are a couple of CRC errors, so you should replace the SATA cable to rule it out and then rebuild to the same disk. To do that:
     - stop the array
     - unassign parity
     - start the array
     - stop the array
     - reassign parity
     - start the array to begin the parity sync
     If it happens again grab the diags before rebooting. PS: three more disks have UDMA_CRC errors; these may or may not be old, so monitor them for a couple of weeks, and if any of them increase, replace that SATA cable.
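One way to monitor those counters over a couple of weeks is to record the raw value of SMART attribute 199 (UDMA_CRC_Error_Count) from `smartctl -A` output and compare runs. A minimal sketch; the sample output below is illustrative, not taken from the poster's diagnostics:

```python
# Sketch: extract SMART attribute 199 (UDMA_CRC_Error_Count) from the
# attribute table printed by `smartctl -A /dev/sdX`. The sample text is
# illustrative only; real output has more attributes.

SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       2
"""

def crc_error_count(smart_output: str) -> int:
    """Return the raw value of attribute 199 (last column of its row)."""
    for line in smart_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "199":
            return int(fields[-1])
    raise ValueError("attribute 199 not found")

print(crc_error_count(SAMPLE))  # → 2
```

Note that CRC errors never reset to zero; only an *increase* between readings indicates an ongoing cabling problem.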
  7. If you have them, post the pre-reboot diags; if not, post the current diagnostics, not just the syslog.
  8. Same here. OP, are your disks ReiserFS?
  9. See here for some mods to improve cooling in the DS380:
  10. You clearly need better cooling; none of my disks goes above 40C during parity checks. 60C is the max for most disks, and you don't want to be anywhere near that; <45C is considered safe.
  11. You're having multiple issues:

      1) The 4 disks connected to the port multiplier were missing at boot:

      May 13 00:27:27 nasvm kernel: mdcmd (10): import 9
      May 13 00:27:27 nasvm kernel: md: import_slot: 9 missing
      May 13 00:27:27 nasvm kernel: mdcmd (11): import 10
      May 13 00:27:27 nasvm kernel: md: import_slot: 10 missing
      May 13 00:27:27 nasvm kernel: mdcmd (12): import 11
      May 13 00:27:27 nasvm kernel: md: import_slot: 11 missing
      May 13 00:27:27 nasvm kernel: mdcmd (13): import 12
      May 13 00:27:27 nasvm kernel: md: import_slot: 12 missing

      The controller then reset itself and the 4 disks appeared; I have read of issues with ASMedia + port multiplier before, though after this initial hiccup they behaved:

      May 13 00:27:59 nasvm kernel: ata7: exception Emask 0x10 SAct 0x0 SErr 0x4050000 action 0xe frozen
      May 13 00:27:59 nasvm kernel: ata7: irq_stat 0x00400040, connection status changed
      May 13 00:27:59 nasvm kernel: ata7: SError: { PHYRdyChg CommWake DevExch }
      May 13 00:27:59 nasvm kernel: ata7: hard resetting link
      May 13 00:28:05 nasvm kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      May 13 00:28:05 nasvm kernel: ata7.15: Port Multiplier 1.2, 0x197b:0x575f r0, 15 ports, feat 0x5/0xf
      May 13 00:28:05 nasvm kernel: ata7.00: hard resetting link
      May 13 00:28:19 nasvm kernel: ata7.15: qc timeout (cmd 0xe4)
      May 13 00:28:19 nasvm kernel: ata7.00: failed to read SCR 0 (Emask=0x4)
      May 13 00:28:19 nasvm kernel: ata7.00: failed to read SCR 0 (Emask=0x40)
      May 13 00:28:19 nasvm kernel: ata7.00: failed to read SCR 1 (Emask=0x40)
      May 13 00:28:19 nasvm kernel: ata7.00: failed to read SCR 0 (Emask=0x40)
      May 13 00:28:19 nasvm kernel: ata7.01: hard resetting link
      May 13 00:28:19 nasvm kernel: ata7.01: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      May 13 00:28:19 nasvm kernel: ata7.02: hard resetting link
      May 13 00:28:19 nasvm kernel: ata7.02: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      May 13 00:28:19 nasvm kernel: ata7.03: hard resetting link
      May 13 00:28:20 nasvm kernel: ata7.03: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      May 13 00:28:20 nasvm kernel: ata7.04: hard resetting link
      May 13 00:28:20 nasvm kernel: ata7.04: SATA link down (SStatus 0 SControl 330)
      May 13 00:28:20 nasvm kernel: ata7.05: hard resetting link
      May 13 00:28:21 nasvm kernel: ata7.05: SATA link down (SStatus 0 SControl 330)
      May 13 00:28:21 nasvm kernel: ata7.06: hard resetting link
      May 13 00:28:21 nasvm kernel: ata7.06: SATA link down (SStatus 0 SControl 330)
      May 13 00:28:21 nasvm kernel: ata7.07: hard resetting link
      May 13 00:28:21 nasvm kernel: ata7.07: SATA link down (SStatus 0 SControl 330)
      May 13 00:28:21 nasvm kernel: ata7.08: hard resetting link
      May 13 00:28:21 nasvm kernel: ata7.08: SATA link down (SStatus 0 SControl 330)
      May 13 00:28:21 nasvm kernel: ata7.09: hard resetting link
      May 13 00:28:22 nasvm kernel: ata7.09: SATA link down (SStatus 0 SControl 330)
      May 13 00:28:22 nasvm kernel: ata7.10: hard resetting link
      May 13 00:28:22 nasvm kernel: ata7.10: SATA link down (SStatus 0 SControl 330)
      May 13 00:28:22 nasvm kernel: ata7.11: hard resetting link
      May 13 00:28:22 nasvm kernel: ata7.11: SATA link down (SStatus 0 SControl 330)
      May 13 00:28:22 nasvm kernel: ata7.12: hard resetting link
      May 13 00:28:23 nasvm kernel: ata7.12: SATA link down (SStatus 0 SControl 330)
      May 13 00:28:23 nasvm kernel: ata7.13: hard resetting link
      May 13 00:28:23 nasvm kernel: ata7.13: SATA link down (SStatus 0 SControl 330)
      May 13 00:28:23 nasvm kernel: ata7.14: hard resetting link
      May 13 00:28:23 nasvm kernel: ata7.14: SATA link down (SStatus 0 SControl 330)
      May 13 00:28:23 nasvm kernel: ata7.00: ATA-9: ST3000DM001-1ER166, Z500QE1D, CC25, max UDMA/133
      May 13 00:28:23 nasvm kernel: ata7.00: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
      May 13 00:28:23 nasvm kernel: ata7.00: configured for UDMA/133
      May 13 00:28:23 nasvm kernel: ata7.01: ATA-9: WDC WD40EFRX-68WT0N0, WD-WCC4E1VYSE25, 82.00A82, max UDMA/133
      May 13 00:28:23 nasvm kernel: ata7.01: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
      May 13 00:28:23 nasvm kernel: ata7.01: configured for UDMA/133
      May 13 00:28:23 nasvm kernel: ata7.02: ATA-9: WDC WD40EFRX-68WT0N0, WD-WCC4ECK3HRUH, 80.00A80, max UDMA/133
      May 13 00:28:23 nasvm kernel: ata7.02: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
      May 13 00:28:23 nasvm kernel: ata7.02: configured for UDMA/133
      May 13 00:28:23 nasvm kernel: ata7.03: ATA-9: WDC WD40EFRX-68WT0N0, WD-WCC4ECK3HKDE, 80.00A80, max UDMA/133
      May 13 00:28:23 nasvm kernel: ata7.03: 7814037168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
      May 13 00:28:23 nasvm kernel: ata7.03: configured for UDMA/133
      May 13 00:28:23 nasvm kernel: ata7: EH complete

      2) The parity disk is having issues; these look to me like an actual disk problem:

      May 13 02:59:01 nasvm kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
      May 13 02:59:01 nasvm kernel: ata5.00: failed command: READ DMA EXT
      May 13 02:59:01 nasvm kernel: ata5.00: cmd 25/00:40:d8:6f:c0/00:05:3c:02:00/e0 tag 20 dma 688128 in
      May 13 02:59:01 nasvm kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      May 13 02:59:01 nasvm kernel: ata5.00: status: { DRDY }
      May 13 02:59:01 nasvm kernel: ata5: hard resetting link
      May 13 02:59:10 nasvm kernel: ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      May 13 02:59:10 nasvm kernel: ata5.00: configured for UDMA/133
      May 13 02:59:10 nasvm kernel: ata5: EH complete
      May 13 03:00:25 nasvm kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
      May 13 03:00:25 nasvm kernel: ata5.00: irq_stat 0x40000001
      May 13 03:00:25 nasvm kernel: ata5.00: failed command: READ DMA EXT
      May 13 03:00:25 nasvm kernel: ata5.00: cmd 25/00:40:78:33:c0/00:05:3c:02:00/e0 tag 25 dma 688128 in
      May 13 03:00:25 nasvm kernel: res 53/40:00:78:36:c0/00:00:3c:02:00/00 Emask 0x8 (media error)
      May 13 03:00:25 nasvm kernel: ata5.00: status: { DRDY SENSE ERR }
      May 13 03:00:25 nasvm kernel: ata5.00: error: { UNC }
      May 13 03:00:25 nasvm kernel: ata5.00: configured for UDMA/133
      May 13 03:00:25 nasvm kernel: sd 5:0:0:0: [sdf] tag#25 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
      May 13 03:00:25 nasvm kernel: sd 5:0:0:0: [sdf] tag#25 Sense Key : 0x3 [current]
      May 13 03:00:25 nasvm kernel: sd 5:0:0:0: [sdf] tag#25 ASC=0x11 ASCQ=0x0
      May 13 03:00:25 nasvm kernel: sd 5:0:0:0: [sdf] tag#25 CDB: opcode=0x88 88 00 00 00 00 02 3c c0 33 78 00 00 05 40 00 00
      May 13 03:00:25 nasvm kernel: blk_update_request: I/O error, dev sdf, sector 9609163640
      May 13 03:00:25 nasvm kernel: md: disk0 read error, sector=9609163576
      May 13 03:00:25 nasvm kernel: md: disk0 read error, sector=9609163584
      May 13 03:00:25 nasvm kernel: md: disk0 read error, sector=9609163592
      May 13 03:00:25 nasvm kernel: md: disk0 read error, sector=9609163600
      May 13 03:00:25 nasvm kernel: md: disk0 read error, sector=9609163608
      May 13 03:00:25 nasvm kernel: md: disk0 read error, sector=9609163616
      May 13 03:00:25 nasvm kernel: ata5: EH complete

      3) The 4 disks on the Marvell controller are timing out multiple times:

      May 14 16:08:43 nasvm kernel: ata13.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
      May 14 16:08:43 nasvm kernel: ata13.00: failed command: IDENTIFY DEVICE
      May 14 16:08:43 nasvm kernel: ata13.00: cmd ec/00:01:00:00:00/00:00:00:00:00/00 tag 16 pio 512 in
      May 14 16:08:43 nasvm kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
      May 14 16:08:43 nasvm kernel: ata13.00: status: { DRDY }
      May 14 16:08:43 nasvm kernel: ata13: hard resetting link
      May 14 16:08:43 nasvm kernel: ata13: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      May 14 16:08:43 nasvm kernel: ata13.00: configured for UDMA/133
      May 14 16:08:43 nasvm kernel: ata13: EH complete
      May 14 16:39:07 nasvm kernel: ata11.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
      May 14 16:39:07 nasvm kernel: ata11.00: failed command: SMART
      May 14 16:39:07 nasvm kernel: ata11.00: cmd b0/d1:01:01:4f:c2/00:00:00:00:00/00 tag 9 pio 512 in
      May 14 16:39:07 nasvm kernel: res 40/00:ff:ff:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
      May 14 16:39:07 nasvm kernel: ata11.00: status: { DRDY }
      May 14 16:39:07 nasvm kernel: ata11: hard resetting link
      May 14 16:39:08 nasvm kernel: ata11: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      May 14 16:39:08 nasvm kernel: ata11.00: configured for UDMA/133
      May 14 16:39:08 nasvm kernel: ata11: EH complete
      May 14 18:09:36 nasvm kernel: ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
      May 14 18:09:36 nasvm kernel: ata14.00: failed command: SMART
      May 14 18:09:36 nasvm kernel: ata14.00: cmd b0/d0:01:00:4f:c2/00:00:00:00:00/00 tag 12 pio 512 in
      May 14 18:09:36 nasvm kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
      May 14 18:09:36 nasvm kernel: ata14.00: status: { DRDY }
      May 14 18:09:36 nasvm kernel: ata14: hard resetting link
      May 14 18:09:36 nasvm kernel: ata14: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      May 14 18:09:36 nasvm kernel: ata14.00: configured for UDMA/133
      May 14 18:09:36 nasvm kernel: ata14: EH complete
      ...
      May 14 22:11:47 nasvm kernel: ata12.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
      May 14 22:11:47 nasvm kernel: ata12.00: failed command: SMART
      May 14 22:11:47 nasvm kernel: ata12.00: cmd b0/d0:01:00:4f:c2/00:00:00:00:00/00 tag 9 pio 512 in
      May 14 22:11:47 nasvm kernel: res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
      May 14 22:11:47 nasvm kernel: ata12.00: status: { DRDY }
      May 14 22:11:47 nasvm kernel: ata12: hard resetting link
      May 14 22:11:48 nasvm kernel: ata12: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      May 14 22:11:48 nasvm kernel: ata12.00: configured for UDMA/133
      May 14 22:11:48 nasvm kernel: ata12: EH complete

      This is a known issue with these controllers; I advise replacing it with an LSI HBA (get an 8-port one and you can get rid of the port multiplier as well).
  12. No, any new disk added to an array needs to be cleared first so parity remains valid.
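The reason clearing keeps parity valid can be shown with a toy example: single parity is the byte-wise XOR of all data disks, and XOR-ing in an all-zero disk changes nothing. A minimal sketch with hypothetical byte values (not Unraid's actual code):

```python
from functools import reduce

# Toy model: single parity is the byte-wise XOR across all data disks.
def parity_of(disks):
    return [reduce(lambda a, b: a ^ b, column) for column in zip(*disks)]

disk1 = [0x12, 0xA5, 0xFF]   # hypothetical disk contents
disk2 = [0x0F, 0x33, 0x80]
parity = parity_of([disk1, disk2])

# A pre-cleared (all-zero) disk XORs in as a no-op, so existing parity
# stays valid and no parity rebuild is needed when the disk is added.
cleared = [0x00, 0x00, 0x00]
print(parity_of([disk1, disk2, cleared]) == parity)  # → True
```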
  13. Using the default high-water allocation, disk1 should fill to 50%, then disks 2 and 3 to 50%, then disk1 again to 75%, and so on.
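A rough sketch of how a high-water style allocator picks a target disk, under the simplified rule described above (the water mark starts at half the largest disk and halves once every disk has reached it). This is an illustration only, not Unraid's actual implementation:

```python
# Simplified high-water allocation: write to the first disk whose free
# space is still above the current water mark; once all disks have hit
# the mark, halve it. Illustration only - not Unraid's actual code.

def pick_disk(sizes, used):
    """Return (disk index, current mark) for the next write, or (None, mark)."""
    mark = max(sizes) / 2
    while mark > 0.5:                      # stop once the mark is negligible
        for i, (size, u) in enumerate(zip(sizes, used)):
            if size - u > mark:            # this disk still has room above the mark
                return i, mark
        mark /= 2                          # every disk reached the mark: halve it
    return None, mark

# Three 10TB disks (units arbitrary): disk 0 fills to 50%, then disks 1
# and 2 to 50%, then disk 0 again toward 75%, matching the rule above.
sizes = [10, 10, 10]
print(pick_disk(sizes, [0, 0, 0]))   # → (0, 5.0)
print(pick_disk(sizes, [5, 0, 0]))   # → (1, 5.0)
print(pick_disk(sizes, [5, 5, 5]))   # → (0, 2.5)
```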
  14. Not automatically, you can use the unbalance plugin.
  15. That's not what your diags show. I can't say the full names of the shares because the diags are anonymized, but these 3 are split level 1 (split top level only): a-----a d------s s-------m
  16. You have shares with split level 1, and split level overrides included disks.
  17. Check included and excluded disks, both for the share and in the global share settings; if you can't find the problem post your diags.
  18. Doubt it's related to them being ZFS; the existing filesystem should have no impact on preclear, so something else is going on.
  19. Yes, but the controller needs to support it as well.
  20. I like them very much but be aware that development and software support were discontinued by Ubiquiti (although they still sell them).
  21. That's a good idea, I'll try it when I get the chance.
  22. My interest was comparing the HBAs with the Intel onboard controller, so I measured with the 8 devices connected to the onboard ports, at idle and during a read check, and repeated with the 8 devices connected to the HBAs. I used a Ubiquiti mFi that, from what I've read, gives very accurate readings (rounded to the closest watt):

      Controller - Idle - Read check
      Onboard    -  41  -  51
      SAS2LP     -  47  -  57
      LSI 9211   -  47  -  57
      LSI 9207   -  50  -  60

      Note: CPU usage represents the biggest part of the increase during the read check; I have no way of measuring just the HBAs.
  23. Interesting; I measured 6W from both of these, idling with 8 devices connected.