JorgeB Posted January 12, 2017

That's fine. Sometimes there are much bigger differences; they are completely normal.
Rich (Author) Posted January 12, 2017

That's OK then.
Rich (Author) Posted January 13, 2017

Well, looks like I don't have to wait a week after all :'( I ran an initial parity check (the first parity check after a reboot has traditionally been fault-free) and got 0 errors, but saw a timeout within the first hour of the check:

Jan 13 00:13:40 unRAID kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
Jan 13 00:13:40 unRAID kernel: sas: trying to find task 0xffff88038c341400
Jan 13 00:13:40 unRAID kernel: sas: sas_scsi_find_task: aborting task 0xffff88038c341400
Jan 13 00:13:40 unRAID kernel: sas: sas_scsi_find_task: task 0xffff88038c341400 is aborted
Jan 13 00:13:40 unRAID kernel: sas: sas_eh_handle_sas_errors: task 0xffff88038c341400 is aborted
Jan 13 00:13:40 unRAID kernel: sas: ata15: end_device-8:0: cmd error handler
Jan 13 00:13:40 unRAID kernel: sas: ata15: end_device-8:0: dev error handler
Jan 13 00:13:40 unRAID kernel: sas: ata16: end_device-8:1: dev error handler
Jan 13 00:13:40 unRAID kernel: ata15.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 13 00:13:40 unRAID kernel: sas: ata17: end_device-8:2: dev error handler
Jan 13 00:13:40 unRAID kernel: sas: ata18: end_device-8:3: dev error handler
Jan 13 00:13:40 unRAID kernel: ata15.00: failed command: READ DMA EXT
Jan 13 00:13:40 unRAID kernel: ata15.00: cmd 25/00:00:08:2b:f4/00:04:27:00:00/e0 tag 29 dma 524288 in
Jan 13 00:13:40 unRAID kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 13 00:13:40 unRAID kernel: ata15.00: status: { DRDY }
Jan 13 00:13:40 unRAID kernel: ata15: hard resetting link
Jan 13 00:13:40 unRAID kernel: sas: sas_form_port: phy0 belongs to port0 already(1)!
Jan 13 00:13:42 unRAID kernel: drivers/scsi/mvsas/mv_sas.c 1430:mvs_I_T_nexus_reset for device[0]:rc= 0
Jan 13 00:13:42 unRAID kernel: ata15.00: configured for UDMA/133
Jan 13 00:13:42 unRAID kernel: ata15.00: device reported invalid CHS sector 0
Jan 13 00:13:42 unRAID kernel: ata15: EH complete
Jan 13 00:13:42 unRAID kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1

Now (it's not even been 24 hours since installing the new controller and rebooting) I'm getting kernel crashes again (below).

So, to recap: I've changed the controller that all the disks getting errors are attached to. I've changed the SAS cable running from the controller to the four disks. I've run a 24-hour memtest and found 0 errors. That leaves power cables as the only easy thing left to swap out, so I'm going to try that next. The disks that are experiencing errors are all fed from the same output on the PSU, but the cable has an extension, so it powers other drives as well (connected to the other controller), so I'm not sure how likely it is to be the cable, as I would have thought more disks would be experiencing problems.

What are the odds of this being motherboard or PCIe port related? I'm struggling to think logically about what else it can be, as it's only these four drives (connected to the same controller, which I've now replaced) that are experiencing parity errors.
All the drives connected to the motherboard SATA ports and to the other controller are fine.

Jan 13 18:00:23 unRAID kernel: ------------[ cut here ]------------
Jan 13 18:00:23 unRAID kernel: WARNING: CPU: 0 PID: 9493 at arch/x86/kernel/cpu/perf_event_intel_ds.c:334 reserve_ds_buffers+0x110/0x33d()
Jan 13 18:00:23 unRAID kernel: alloc_bts_buffer: BTS buffer allocation failure
Jan 13 18:00:23 unRAID kernel: Modules linked in: xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables vhost_net vhost macvtap macvlan tun iptable_mangle xt_nat veth ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat md_mod ahci x86_pkg_temp_thermal coretemp kvm_intel mvsas i2c_i801 kvm e1000 i2c_core libahci libsas r8169 mii scsi_transport_sas wmi
Jan 13 18:00:23 unRAID kernel: CPU: 0 PID: 9493 Comm: qemu-system-x86 Not tainted 4.4.30-unRAID #2
Jan 13 18:00:23 unRAID kernel: Hardware name: ASUS All Series/Z87-K, BIOS 1402 11/05/2014
Jan 13 18:00:23 unRAID kernel: 0000000000000000 ffff880341663920 ffffffff8136f79f ffff880341663968
Jan 13 18:00:23 unRAID kernel: 000000000000014e ffff880341663958 ffffffff8104a4ab ffffffff81020934
Jan 13 18:00:23 unRAID kernel: 0000000000000000 0000000000000001 0000000000000006 ffff880360779680
Jan 13 18:00:23 unRAID kernel: Call Trace:
Jan 13 18:00:23 unRAID kernel: [<ffffffff8136f79f>] dump_stack+0x61/0x7e
Jan 13 18:00:23 unRAID kernel: [<ffffffff8104a4ab>] warn_slowpath_common+0x8f/0xa8
Jan 13 18:00:23 unRAID kernel: [<ffffffff81020934>] ? reserve_ds_buffers+0x110/0x33d
Jan 13 18:00:23 unRAID kernel: [<ffffffff8104a507>] warn_slowpath_fmt+0x43/0x4b
Jan 13 18:00:23 unRAID kernel: [<ffffffff810f7501>] ? __kmalloc_node+0x22/0x153
Jan 13 18:00:23 unRAID kernel: [<ffffffff81020934>] reserve_ds_buffers+0x110/0x33d
Jan 13 18:00:23 unRAID kernel: [<ffffffff8101b3fc>] x86_reserve_hardware+0x135/0x147
Jan 13 18:00:23 unRAID kernel: [<ffffffff8101b45e>] x86_pmu_event_init+0x50/0x1c9
Jan 13 18:00:23 unRAID kernel: [<ffffffff810ae7bd>] perf_try_init_event+0x41/0x72
Jan 13 18:00:23 unRAID kernel: [<ffffffff810aec0e>] perf_event_alloc+0x420/0x66e
Jan 13 18:00:23 unRAID kernel: [<ffffffffa00bf58e>] ? kvm_dev_ioctl_get_cpuid+0x1c0/0x1c0 [kvm]
Jan 13 18:00:23 unRAID kernel: [<ffffffff810b0bbb>] perf_event_create_kernel_counter+0x22/0x112
Jan 13 18:00:23 unRAID kernel: [<ffffffffa00bf6d9>] pmc_reprogram_counter+0xbf/0x104 [kvm]
Jan 13 18:00:23 unRAID kernel: [<ffffffffa00bf92b>] reprogram_fixed_counter+0xc7/0xd8 [kvm]
Jan 13 18:00:23 unRAID kernel: [<ffffffffa03ca987>] intel_pmu_set_msr+0xe0/0x2ca [kvm_intel]
Jan 13 18:00:23 unRAID kernel: [<ffffffffa00bfb2c>] kvm_pmu_set_msr+0x15/0x17 [kvm]
Jan 13 18:00:23 unRAID kernel: [<ffffffffa00a1a57>] kvm_set_msr_common+0x921/0x983 [kvm]
Jan 13 18:00:23 unRAID kernel: [<ffffffffa03ca400>] vmx_set_msr+0x2ec/0x2fe [kvm_intel]
Jan 13 18:00:23 unRAID kernel: [<ffffffffa009e424>] kvm_set_msr+0x61/0x63 [kvm]
Jan 13 18:00:23 unRAID kernel: [<ffffffffa03c39c4>] handle_wrmsr+0x3b/0x62 [kvm_intel]
Jan 13 18:00:23 unRAID kernel: [<ffffffffa03c863f>] vmx_handle_exit+0xfbb/0x1053 [kvm_intel]
Jan 13 18:00:23 unRAID kernel: [<ffffffffa03ca105>] ? vmx_vcpu_run+0x30e/0x31d [kvm_intel]
Jan 13 18:00:23 unRAID kernel: [<ffffffffa00a7f92>] kvm_arch_vcpu_ioctl_run+0x38a/0x1080 [kvm]
Jan 13 18:00:23 unRAID kernel: [<ffffffffa00a2938>] ? kvm_arch_vcpu_load+0x6b/0x16c [kvm]
Jan 13 18:00:23 unRAID kernel: [<ffffffffa00a29b5>] ? kvm_arch_vcpu_load+0xe8/0x16c [kvm]
Jan 13 18:00:23 unRAID kernel: [<ffffffffa0098cff>] kvm_vcpu_ioctl+0x178/0x499 [kvm]
Jan 13 18:00:23 unRAID kernel: [<ffffffffa009b152>] ? kvm_vm_ioctl+0x3e8/0x5d8 [kvm]
Jan 13 18:00:23 unRAID kernel: [<ffffffff8111869e>] do_vfs_ioctl+0x3a3/0x416
Jan 13 18:00:23 unRAID kernel: [<ffffffff8112070e>] ? __fget+0x72/0x7e
Jan 13 18:00:23 unRAID kernel: [<ffffffff8111874f>] SyS_ioctl+0x3e/0x5c
Jan 13 18:00:23 unRAID kernel: [<ffffffff81629c2e>] entry_SYSCALL_64_fastpath+0x12/0x6d
Jan 13 18:00:23 unRAID kernel: ---[ end trace 9a1ea458c732cff8 ]---
Jan 13 18:00:23 unRAID kernel: qemu-system-x86: page allocation failure: order:4, mode:0x260c0c0
Jan 13 18:00:23 unRAID kernel: CPU: 0 PID: 9493 Comm: qemu-system-x86 Tainted: G W 4.4.30-unRAID #2
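As an aside that may help while parts are being swapped: resets like the ones above can be watched in real time during a parity check by following the syslog from an SSH or console session and filtering for the mvsas/SAS/ATA messages. A minimal sketch, assuming the suspect disks are still ata15-ata18 as in the log above (adjust the pattern to your own ata numbers):

tail -f /var/log/syslog | grep -Ei 'mvsas|sas:|ata1[5-8]'

Catching the exact timestamp of the first reset makes it easier to line the event up with whatever the array was doing at the time.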
Rich (Author) Posted January 14, 2017

So swapping the controller seems to be causing more, and/or more frequent, errors, as if it has somehow aggravated the issue. I am now seeing endpoint connection errors:

Jan 14 22:01:45 unRAID emhttp: err: need_authorization: getpeername: Transport endpoint is not connected
Jan 14 22:02:07 unRAID emhttp: err: need_authorization: getpeername: Transport endpoint is not connected
Jan 14 22:02:07 unRAID emhttp: err: need_authorization: getpeername: Transport endpoint is not connected
Jan 14 22:02:07 unRAID emhttp: err: need_authorization: getpeername: Transport endpoint is not connected
Jan 14 22:04:07 unRAID emhttp: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Jan 14 22:04:31 unRAID emhttp: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Jan 14 22:07:39 unRAID emhttp: err: need_authorization: getpeername: Transport endpoint is not connected
Jan 14 22:07:39 unRAID emhttp: err: need_authorization: getpeername: Transport endpoint is not connected
Jan 14 22:08:14 unRAID emhttp: err: need_authorization: getpeername: Transport endpoint is not connected
Jan 14 22:08:17 unRAID php: /usr/local/emhttp/plugins/advanced.buttons/script/plugin 'check' 'advanced.buttons.plg' 'ca.cleanup.appdata.plg' 'ca.update.applications.plg' 'community.applications.plg' 'dynamix.active.streams.plg' 'dynamix.local.master.plg' 'dynamix.plg' 'dynamix.ssd.trim.plg' 'dynamix.system.buttons.plg' 'dynamix.system.info.plg' 'dynamix.system.stats.plg' 'dynamix.system.temp.plg' 'unassigned.devices.plg' 'unRAIDServer.plg' &>/dev/null &
Jan 14 22:08:58 unRAID emhttp: err: need_authorization: getpeername: Transport endpoint is not connected

I am way out of my depth in trying to diagnose this, so I would appreciate any help anyone can give. I still have yet to swap out the power cables feeding the drives that are generating errors, but if it's not that, I'm out of ideas. After all the previously listed symptoms and errors, is anyone able to suggest anything else it could be?
JorgeB Posted January 14, 2017

If you don't need it, disable VT-d in your BIOS to check whether it makes a difference.
Rich (Author) Posted January 14, 2017

I run Docker containers and a Windows 10 VM, so it's not ideally what I want to do, but I can try it for a while. It's been enabled for well over a year and not caused any issues, though; it's only since the new PSU and the additional controller + cabling that I've started seeing errors. I know I'll have to stop using the VM when disabling VT-d, but will it affect Docker as well?
JorgeB Posted January 14, 2017

It won't affect Docker, and you can still use the VM as long as you're not passing through any PCI device.
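One hedged aside for the VT-d test: after toggling the setting in the BIOS, you can confirm from a console whether the kernel actually initialized an IOMMU. This is a generic Linux check rather than anything unRAID-specific:

dmesg | grep -iE 'DMAR|IOMMU'

With VT-d disabled you should see no DMAR/IOMMU initialization lines; with it enabled you normally will.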
Rich (Author) Posted January 15, 2017

Awesome, thank you. I'll try that if swapping the cable doesn't help.
John_M Posted January 15, 2017

I was going to suggest the same. The Marvell 9485 (as used by the SAS2LP) is on the list here. Several people seem to be having similar problems at the moment.
Rich (Author) Posted January 15, 2017

Oh, that's interesting; I hadn't come across that. Thanks for the link, John_M. The only strange thing about it, though, is that I've been running with one SAS2LP controller for years; it's only since I added another that I have been seeing problems. Do you think it's possible that this problem could occur with one card and not the other? They're both running the same, most up-to-date firmware version.
JorgeB Posted January 15, 2017

It's probably not VT-d, since the issue started with only 4 disks on the same port, but there's still no harm in trying to rule it out.
Rich (Author) Posted January 15, 2017

Agreed. I'm definitely going to try it.
John_M Posted January 15, 2017

I agree it would be good to be able to rule it out. I certainly can't enable IOMMU (in my case AMD-Vi) on my server that uses a single SAS2LP; otherwise I get contiguous blocks of errors. Turn it off and it's fine. For me that's not a problem, as I only have one VM and it runs headless. If I did need IOMMU I'd use a different SAS controller.
Rich (Author) Posted January 15, 2017

Well, it certainly looks like it could be the same issue. I've checked the Marvell chipsets on the cards: one has the 9485, which is on the list, and one has the 9480, which isn't. If my first card is the 9480 chipset, that would certainly explain why I've only started seeing errors since the second card was installed. I've just swapped over the power cables and am waiting to see what happens; if I get errors, then I will disable VT-d. I pass through a NIC and USB connections from a UPS and a Corsair Link PSU, so it's gonna mean selling the card if it's this.
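For anyone wanting to run the same check, the Marvell chip on each card can be listed straight from the unRAID console with a standard Linux command, for example:

lspci | grep -i marvell

which prints one line per controller in the same format as the outputs quoted in the next couple of posts.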
John_M Posted January 15, 2017

Mine has a 9485 too:

01:00.0 RAID bus controller: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller (rev c3)

It might be worth testing for a while longer and then trying the workaround that's mentioned in that thread.
Rich (Author) Posted January 15, 2017

These are mine:

01:00.0 RAID bus controller: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller (rev c3)
06:00.0 RAID bus controller: Marvell Technology Group Ltd. 88SE9480 SAS/SATA 6Gb/s RAID controller (rev c3)

Yeah, I will do. Unless I get instant errors, I've been running changes for a week before moving to the next one, so I'll do the same with the power cable, then disable VT-d, and then add iommu=pt to syslinux.conf. I'm really hoping it's the power cable, lol. Thanks for pitching in, I appreciate the help. Are you aware of anyone contacting either Supermicro or Marvell regarding the potential chipset bug?
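For reference, the iommu=pt workaround mentioned above is just an extra kernel parameter on the boot line. As a sketch only, since the exact default contents of the file vary between installs and should be checked before editing, the relevant stanza in /boot/syslinux/syslinux.cfg on the flash drive would end up looking something like:

label unRAID OS
  menu default
  kernel /bzimage
  append iommu=pt initrd=/bzroot

A reboot is needed for the change to take effect.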
John_M Posted January 15, 2017

Lol. I try to help where I can, but I'm not entirely selfless. Since suffering from the "Marvell bug" myself I've been trying to find out more about it, but there isn't a lot of information. I've spotted a number of people with symptoms similar to yours and asked if they have IOMMU enabled. Your symptoms are similar but not identical to the ones mentioned in that thread. My own symptoms were different again, but similar to someone else's. I'm trying to build up a picture, but I'm not getting much feedback. All I can say for sure is that, without changing anything else, I can switch my problem on and off by enabling and disabling AMD-Vi. The problem appears not to be with the chipset but with the Linux driver, though I don't know whether there's any active attempt at fixing it. Perhaps Marvell and Supermicro don't have the resources to help. If I needed to use IOMMU I'd simply replace the card with an LSI-based one. The trouble with those is that they don't work with unRAID out of the box: they need to be reflashed with IT firmware, which ought to be almost trivially simple but which many people find more difficult than it really should be.
Rich (Author) Posted January 15, 2017

That's frustrating: same chipset but a variation of symptoms. If it is the Linux driver, is this something Limetech could address, or at least flag to the right party? If you'd like me to try anything else to help you with your controller diagnosis, just let me know. I'll post back my results as soon as I have an update.
Squid Posted January 15, 2017

The problem is that a minority of users with the SAS2 have that problem with IOMMU enabled. I, for one, don't. Which leads to the question: is it the SAS2, is it the motherboard, or is it the BIOS? All of those variables make it difficult to diagnose / fix.
Rich (Author) Posted January 15, 2017

Squid wrote:
    The problem is that a minority of users with the SAS2 have that problem with IOMMU enabled. I, for one, don't. Which leads to the question: is it the SAS2, is it the motherboard, or is it the BIOS? All of those variables make it difficult to diagnose / fix.

I have two SAS2s, the older of which works perfectly, but since the installation of the newer controller (and its subsequent RMA'd replacement) I have started to see the mentioned problems. So I could at least differentiate between a potentially functional SAS2 and a dodgy one.
John_M Posted January 15, 2017

@Rich: The things that interest me are the actual Marvell chip used, the symptoms, whether turning off IOMMU fixes the problem, and whether the workaround allows you to turn IOMMU back on. I know the first two in your case; if you can post back your findings on the second two in due course, once you've convinced yourself through testing, I'd be grateful. The bug is well and truly flagged. The trouble is that it was first reported five years ago and it still isn't fixed. I honestly doubt that Limetech has the resources to investigate it further.

@Squid: It's further complicated by different versions of the SAS2LP having different Marvell chips. Mine and one of Rich's have the 9485, which is affected. His other has a 9480, which doesn't seem to be affected. Since yours also isn't affected, I'd be very interested to know whether it uses a 9480 too.
Squid Posted January 15, 2017

Rich wrote:
    I have two SAS2s, the older of which works perfectly, but since the installation of the newer controller (and its subsequent RMA'd replacement) I have started to see the mentioned problems.

What happens if you reflash the SAS2 with the firmware on Supermicro's website? The PCIe IDs of the card as shipped vs. a reflashed card (even with identical firmware revisions) used to change.
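If it helps with that comparison, the as-shipped PCI IDs can be recorded before reflashing with lspci in numeric mode (add -v as well if you also want the subsystem ID). The bracketed IDs in the comment below only illustrate the output format, they are not a prediction of what either card will report:

lspci -nn | grep -i marvell
# e.g. 01:00.0 RAID bus controller [0104]: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller [1b4b:9485] (rev c3)

Running the same command again after the reflash and comparing the vendor:device pairs would show whether the IDs changed.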
Rich (Author) Posted January 15, 2017

So much for waiting a week! This replacement controller, for some reason, takes a lot less time to start generating errors. A few hours into a parity check I got errors and timeouts on one of the disks connected to the new SAS2LP:

Jan 15 15:16:45 unRAID kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 00000336, slot [0].
Jan 15 15:17:17 unRAID kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
Jan 15 15:17:17 unRAID kernel: sas: trying to find task 0xffff8801b6907900
Jan 15 15:17:17 unRAID kernel: sas: sas_scsi_find_task: aborting task 0xffff8801b6907900
Jan 15 15:17:17 unRAID kernel: sas: sas_scsi_find_task: task 0xffff8801b6907900 is aborted
Jan 15 15:17:17 unRAID kernel: sas: sas_eh_handle_sas_errors: task 0xffff8801b6907900 is aborted
Jan 15 15:17:17 unRAID kernel: sas: ata18: end_device-8:3: cmd error handler
Jan 15 15:17:17 unRAID kernel: sas: ata15: end_device-8:0: dev error handler
Jan 15 15:17:17 unRAID kernel: sas: ata16: end_device-8:1: dev error handler
Jan 15 15:17:17 unRAID kernel: sas: ata17: end_device-8:2: dev error handler
Jan 15 15:17:17 unRAID kernel: sas: ata18: end_device-8:3: dev error handler
Jan 15 15:17:17 unRAID kernel: ata18.00: exception Emask 0x0 SAct 0x40000 SErr 0x0 action 0x6 frozen
Jan 15 15:17:17 unRAID kernel: ata18.00: failed command: READ FPDMA QUEUED
Jan 15 15:17:17 unRAID kernel: ata18.00: cmd 60/00:00:10:9e:1a/04:00:69:00:00/40 tag 18 ncq 524288 in
Jan 15 15:17:17 unRAID kernel: res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 15 15:17:17 unRAID kernel: ata18.00: status: { DRDY }
Jan 15 15:17:17 unRAID kernel: ata18: hard resetting link
Jan 15 15:17:17 unRAID kernel: sas: sas_form_port: phy3 belongs to port3 already(1)!
Jan 15 15:17:20 unRAID kernel: drivers/scsi/mvsas/mv_sas.c 1430:mvs_I_T_nexus_reset for device[3]:rc= 0
Jan 15 15:17:20 unRAID kernel: ata18.00: configured for UDMA/133
Jan 15 15:17:20 unRAID kernel: ata18: EH complete
Jan 15 15:17:20 unRAID kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1

So it's on to disabling VT-d now; I'll post back my findings, John_M. Squid, just so I'm clear, are you saying to update the card with the most up-to-date firmware version (which it's already on) to see if the ID changes, or to see if it helps fix the problem?
Squid Posted January 15, 2017

Rich wrote:
    Squid, just so I'm clear, are you saying to update the card with the most up-to-date firmware version (which it's already on) to see if the ID changes, or to see if it helps fix the problem?

Both. Curiosity, mainly: when I got my SAS2, unRAID didn't recognize it at the time, and a reflash with the exact same firmware version allowed unRAID to see it. (That problem was in an early v5 beta and was subsequently fixed.) The net result was that on shipped boards the ID was different from what Supermicro had in the website version of the firmware (with the same version number). It certainly doesn't hurt, as I don't have any issues with IOMMU and the SAS2 (and mine is a 9485).
Rich (Author) Posted January 15, 2017

Squid wrote:
    Both. Curiosity, mainly: when I got my SAS2, unRAID didn't recognize it at the time, and a reflash with the exact same firmware version allowed unRAID to see it.

OK, I'll give that a go too. Is updating it via the standard method (booting into DOS, etc.) the recommended way to go, or is there another way to do it?