Report Comments posted by Duggie264

  1. On 5/3/2019 at 5:25 PM, limetech said:

    That is HP's "hpsa" driver:

    https://sourceforge.net/projects/cciss/files/hpsa-3.0-tarballs/

     

    The version in Linux kernel 4.19 is "3.4.20-125". You can see from the above link that there are newer hpsa versions, but those drivers are designed to be built/integrated into the Red Hat variant of the kernel and do not build with our kernel.

     

    Checking kernels 5.0 and 5.1 reveals they all also use hpsa driver "3.4.20-125". Why hardware vendors insist on maintaining their own driver vs. the stock kernel driver is a mystery. Eventually someone who HP cares about will complain and they'll update the stock kernel driver.

    So I emailed the maintainers from the SF link you provided, and this was their response:

     

    Hi Duggie,

    A quick glance at the unRAID changelog suggests that unRAID 6.6.7 is using the 4.18.20 kernel. Do you know what kernel is in the newer RC versions?

     

    Feel free to submit the diagnostic logs to us. We will take a look. Contrary to their comments, we do maintain the kernel driver as well and have some patches staged to go upstream soon.

     

    Thanks,
    Scott

     

    After sending my diagnostic logs and the unRAID changelogs for the 6.7.0-rcX family, I received this reply from Don:
     

    I see a DMAR error logged. That may be what caused the controller lockup, which caused the OS to send down a reset to the drive.

     

    Are you able to update the driver and build it for a test? If so, there is a structure member in the scsi_host_template called .max_sectors.

    It is set to 2048; wondering if you can change it to 1024 for a test?

     

    If not, I would have to know what OS I could do the build for you on. Not real sure about unraid.

     

     

    Feb 25 23:03:09 TheNewdaleBeast kernel: DMAR: DRHD: handling fault status reg 2

    Feb 25 23:03:09 TheNewdaleBeast kernel: DMAR: [DMA Read] Request device [81:00.0] fault addr fe8c0000 [fault reason 06] PTE Read access is not set

    Feb 25 23:03:40 TheNewdaleBeast kernel: hpsa 0000:81:00.0: scsi 14:0:7:0: resetting physical  Direct-Access     SEAGATE  ST4000NM0023     PHYS DRV SSDSmartPathCap- En- Exp=1

    Feb 25 23:03:57 TheNewdaleBeast avahi-daemon[4764]: Leaving mDNS multicast group on interface br0.IPv6 with address fe80::1085:73ff:fedb:90d4.

    Feb 25 23:03:57 TheNewdaleBeast avahi-daemon[4764]: Joining mDNS multicast group on interface br0.IPv6 with address fd05:820d:9f35:1:d250:99ff:fec2:52fb.

    Feb 25 23:03:57 TheNewdaleBeast avahi-daemon[4764]: Registering new address record for fd05:820d:9f35:1:d250:99ff:fec2:52fb on br0.*.

    Feb 25 23:03:57 TheNewdaleBeast avahi-daemon[4764]: Withdrawing address record for fe80::1085:73ff:fedb:90d4 on br0.

    Feb 25 23:03:58 TheNewdaleBeast ntpd[3173]: Listen normally on 6 br0 [fd05:820d:9f35:1:d250:99ff:fec2:52fb]:123

    Feb 25 23:03:58 TheNewdaleBeast ntpd[3173]: new interface(s) found: waking up resolver

    Feb 25 23:04:33 TheNewdaleBeast kernel: hpsa 0000:81:00.0: Controller lockup detected: 0x00130000 after 30

    Feb 25 23:04:33 TheNewdaleBeast kernel: hpsa 0000:81:00.0: controller lockup detected: LUN:0000000000800601 CDB:01030000000000000000000000000000

    Feb 25 23:04:33 TheNewdaleBeast kernel: hpsa 0000:81:00.0: Controller lockup detected during reset wait

    Feb 25 23:04:33 TheNewdaleBeast kernel: hpsa 0000:81:00.0: scsi 14:0:7:0: reset physical  failed Direct-Access     SEAGATE  ST4000NM0023     PHYS DRV SSDSmartPathCap- En- Exp=1

    Feb 25 23:04:33 TheNewdaleBeast kernel: sd 14:0:7:0: Device offlined - not ready after error recovery

     

     

    @limetech would you be able to assist, as I am currently about 8000 feet below sea level, with only a snorkel for comfort!
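
    For reference, below is roughly what Don's suggested test would look like in the hpsa source (drivers/scsi/hpsa.c). It is only an illustrative sketch: the template name and member layout follow the mainline driver as I understand it, everything except .max_sectors is abbreviated, and the value 1024 is purely the experiment Don proposes.

    /* Abbreviated excerpt of the SCSI host template in drivers/scsi/hpsa.c.
     * All other members are left exactly as shipped; the only edit for the
     * test is halving .max_sectors, as Don suggests. */
    static struct scsi_host_template hpsa_driver_template = {
            .module         = THIS_MODULE,
            /* ... other members unchanged and omitted here ... */
            .max_sectors    = 1024,    /* shipped driver sets 2048 */
    };

    After such a change the driver would need to be rebuilt against the matching kernel source and loaded in place of the stock hpsa module before re-running the workload that triggered the lockup.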

  2. On 5/3/2019 at 5:25 PM, limetech said:

    That is HP's "hpsa" driver:

    https://sourceforge.net/projects/cciss/files/hpsa-3.0-tarballs/

     

    The version in Linux kernel 4.19 is "3.4.20-125". You can see from the above link that there are newer hpsa versions, but those drivers are designed to be built/integrated into the Red Hat variant of the kernel and do not build with our kernel.

     

    Checking kernels 5.0 and 5.1 reveals they all also use hpsa driver "3.4.20-125". Why hardware vendors insist on maintaining their own driver vs. the stock kernel driver is a mystery. Eventually someone who HP cares about will complain and they'll update the stock kernel driver.

    @johnnie.black @limetech

     

    Thanks for your response. I guess I'll just have to stick with 6.6.7 for the foreseeable future. In the meantime I have emailed HP to try and hasten a solution; I will update you if I get a response.

     

    Regards

     

    Duggie
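
    As an aside, for anyone who wants to confirm which hpsa version their running kernel actually loaded, the driver publishes its version through sysfs. The snippet below is an illustrative sketch only, not something from HP or limetech; it assumes /sys/module/hpsa/version is present, which it should be when the driver declares MODULE_VERSION, as the in-tree hpsa driver does. Where hpsa is built as a loadable module, modinfo hpsa reports the same string.

    /*
     * Illustrative only: print the hpsa driver version the running kernel
     * is using, by reading /sys/module/hpsa/version.
     * On kernels 4.19/5.0/5.1 this should report "3.4.20-125".
     */
    #include <stdio.h>

    int main(void)
    {
        char version[64];
        FILE *f = fopen("/sys/module/hpsa/version", "r");

        if (!f) {
            perror("/sys/module/hpsa/version");
            return 1;
        }
        if (fgets(version, sizeof(version), f))
            printf("hpsa driver version: %s", version);
        fclose(f);
        return 0;
    }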

  3. Hi,

     

    I still have the same problem with this update that I have had with every 6.7.0-rcX version, namely that once I reboot, multiple drives fail to mount or are disabled. On reversion to 6.6.7 all the drives are fine.

    I have tried:

    Install --> reboot

    Install --> Power down all VMs and containers --> Turn off Docker and VM manager --> Reboot

    Install --> Power down all VMs and containers --> Turn off Docker and VM manager --> Stop Array --> Reboot

    Regardless of the process I follow, loads of drives appear failed/corrupt/disabled after the reboot once the array is started.

     

    Reversion to 6.6.7 returns the system to normal operation, except for disks 2 and 5, which remain disabled.

     

    See attached diagnostics, taken after the upgrade and reboot, followed by the reversion.

    thenewdalebeast-diagnostics-20190501-0848.zip

     

    If I get some time this weekend, I will do another upgrade, and pull diags at every step.

     

  4. Was working fine on the latest stable (6.6.7). I closed down all of my VMs and containers, ensured the mover had finished, backed up the flash drive, stopped the array, and then updated to 6.7.0-rc6.

     This went better than previous attempts to upgrade, in that on reboot all drives were detected in the correct slots; however, on starting the array there were errors everywhere.

     

    The log is full of errors:

    Mar 30 23:26:05 TheNewdaleBeast emhttpd: error: get_filesystem_status, 6474: Input/output error (5): scandir
    Mar 30 23:26:05 TheNewdaleBeast kernel: XFS (md2): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x15d508f48 len 32 error 5
    Mar 30 23:26:05 TheNewdaleBeast kernel: XFS (md2): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
    Mar 30 23:26:06 TheNewdaleBeast emhttpd: error: get_filesystem_status, 6474: Input/output error (5): scandir
    Mar 30 23:26:06 TheNewdaleBeast kernel: XFS (md2): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x15d508f48 len 32 error 5
    Mar 30 23:26:06 TheNewdaleBeast kernel: XFS (md2): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
    Mar 30 23:26:07 TheNewdaleBeast emhttpd: error: get_filesystem_status, 6474: Input/output error (5): scandir
    Mar 30 23:26:07 TheNewdaleBeast kernel: XFS (md2): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x15d508f48 len 32 error 5
    ............................

    The attached screenshot shows five drives appearing as unmountable (they were mounted and functioning with no problems on stable 6.6.7).

     

    Also attached is the diagnostics log.

     

    Going to revert back to a functioning system!

    Screenshot 2019-03-30 at 23.30.44.png

    thenewdalebeast-diagnostics-20190330-2333.zip

  5. Just updated 6.6.6 to 6.6.7; stable, with no issues.
    I then updated to 6.7.0-rc5 and suffered the same problem as when I tried to update from 6.6.6 to 6.7.0-rc2 (although I didn't post at the time, for various reasons).
     

    When I reboot, multiple drives show as failed/unassigned. I reverted back to 6.6.7, and now my second parity drive and one of my data drives (drive 2) show as disabled. (In the previously unreported instance it was parity 2 and drive 6.)

    I have attached the diagnostic log in case it is of use.

     

    Cheers,

     

    Duggie

    thenewdalebeast-diagnostics-20190225-2307.zip - failed 6.7.0-rc5 install - FS not present, array in tatters

     

     

    thenewdalebeast-diagnostics-20190225-2329.zip - restore to 6.6.7 - 1 parity and 1 array drive borked
