sidezero

Members
  • Posts

    49

Converted

  • Gender
    Undisclosed

sidezero's Achievements

Rookie

Rookie (2/14)

0

Reputation

  1. I had a failed 2TB drive which I was waiting on the replacement for. I swapped it with a new 3TB drive yesterday afternoon, since I have upgraded my parity and begun using 3TB drives as I filled my chassis, and started a rebuild of drive 11. I woke up to everything being unresponsive and this in the syslog. The server still responds to telnet, but that's about it.

     md: recovery thread woken up ...
     md: recovery thread rebuilding disk11 ...
     md: using 1536k window, over a total of 2930266532 blocks.
     sd 8:0:3:0: [sdq] command ecb28c00 timed out
     sd 8:0:3:0: [sdq] command ecb28300 timed out
     sd 8:0:3:0: [sdq] command ecb28840 timed out
     sd 8:0:3:0: [sdq] command f76bbe40 timed out
     sd 8:0:3:0: [sdq] command f239d9c0 timed out
     sas: Enter sas_scsi_recover_host busy: 5 failed: 5
     sas: trying to find task 0xf1c8e000
     sas: sas_scsi_find_task: aborting task 0xf1c8e000
     sas: sas_scsi_find_task: task 0xf1c8e000 is aborted
     sas: sas_eh_handle_sas_errors: task 0xf1c8e000 is aborted
     sas: trying to find task 0xf1c8e200
     sas: sas_scsi_find_task: aborting task 0xf1c8e200
     sas: sas_scsi_find_task: task 0xf1c8e200 is aborted
     sas: sas_eh_handle_sas_errors: task 0xf1c8e200 is aborted
     sas: trying to find task 0xf1c8e900
     sas: sas_scsi_find_task: aborting task 0xf1c8e900
     sas: sas_scsi_find_task: task 0xf1c8e900 is aborted
     sas: sas_eh_handle_sas_errors: task 0xf1c8e900 is aborted
     sas: trying to find task 0xf1c3cc00
     sas: sas_scsi_find_task: aborting task 0xf1c3cc00
     sas: sas_scsi_find_task: task 0xf1c3cc00 is aborted
     sas: sas_eh_handle_sas_errors: task 0xf1c3cc00 is aborted
     sas: trying to find task 0xf1c8ec00
     sas: sas_scsi_find_task: aborting task 0xf1c8ec00
     sas: sas_scsi_find_task: task 0xf1c8ec00 is aborted
     sas: sas_eh_handle_sas_errors: task 0xf1c8ec00 is aborted
     sas: ata18: end_device-8:3: cmd error handler
     sas: ata15: end_device-8:0: dev error handler
     sas: ata16: end_device-8:1: dev error handler
     sas: ata17: end_device-8:2: dev error handler
     sas: ata18: end_device-8:3: dev error handler
     ata18.00: exception Emask 0x0 SAct 0x3e SErr 0x0 action 0x6 frozen
     sas: ata19: end_device-8:4: dev error handler
     ata18.00: failed command: WRITE FPDMA QUEUED
     sas: ata20: end_device-8:5: dev error handler
     ata18.00: cmd 61/00:00:78:99:b2/02:00:a3:00:00/40 tag 1 ncq 262144 out
              res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
     ata18.00: status: { DRDY }
     ata18.00: failed command: WRITE FPDMA QUEUED
     ata18.00: cmd 61/00:00:78:a1:b2/02:00:a3:00:00/40 tag 2 ncq 262144 out
              res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
     ata18.00: status: { DRDY }
     ata18.00: failed command: WRITE FPDMA QUEUED
     ata18.00: cmd 61/00:00:78:9b:b2/02:00:a3:00:00/40 tag 3 ncq 262144 out
              res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
     sas: ata21: end_device-8:6: dev error handler
     ata18.00: status: { DRDY }
     ata18.00: failed command: WRITE FPDMA QUEUED
     ata18.00: cmd 61/00:00:78:9d:b2/02:00:a3:00:00/40 tag 4 ncq 262144 out
              res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
     ata18.00: status: { DRDY }
     ata18.00: failed command: WRITE FPDMA QUEUED
     ata18.00: cmd 61/00:00:78:9f:b2/02:00:a3:00:00/40 tag 5 ncq 262144 out
              res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
     ata18.00: status: { DRDY }
     ata18: hard resetting link
     drivers/scsi/mvsas/mv_sas.c 1527:mvs_I_T_nexus_reset for device[3]:rc= 0
     sas: sas_ata_task_done: SAS error 8a
     sas: sas_ata_task_done: SAS error 8a
     ata18.00: both IDENTIFYs aborted, assuming NODEV
     ata18.00: revalidation failed (errno=-2)
     mvsas 0000:02:00.0: Phy4 : No sig fis
     sas: sas_form_port: phy4 belongs to port3 already(1)!
     ata18: hard resetting link
     ata18.00: configured for UDMA/133
     ata18.00: device reported invalid CHS sector 0
     ata18.00: device reported invalid CHS sector 0
     ata18.00: device reported invalid CHS sector 0
     ata18.00: device reported invalid CHS sector 0
     ata18.00: device reported invalid CHS sector 0
     ata18: EH complete
     sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
     md: sync done. time=43583sec
     md: recovery thread sync completion status: 0

     Not sure what my next step should be at this point.
  2. Been running the rc6 test for about 3 1/2 days now. No issues so far, and I've run a parity check and the mover script without any problems. I previously had tons of issues with every release after B11.
  3. Been running this for about a day and a half and it's been completely stable. I was forced to run beta 11 previously because I kept having issues running 2x SASLP-MV8s. I've also run a parity check without any issues. Will continue to test and update if anything changes.
  4. I have the exact same problem here. I was forced to downgrade back to beta 11, which is confirmed stable on my system. Running an i3 540, Supermicro MBD-X8SIL-F-O, and 2x Supermicro AOC-SASLP-MV8s. Looking forward to 5.0-rc4-scst-1.
  5. I'm running 2 AOC-SASLP-MV8s on RC4. Been about 10 hours now and no issues yet that I've seen.
  6. I too have this occur on a regular basis. Usually it happens when I have sab running and am listening to an MP3: when I attempt to bring up the SMB shares, everything freezes, sometimes for a good 10-15 seconds. The explorer window hangs and the MP3 that is playing just stops. It resumes as soon as the explorer window catches up and lists all the SMB shares. The MP3 stops playing in VLC, Spotify, and iTunes. The PC is running Windows 7. It has happened in 5.0 beta 11, beta 12a, and beta 14. I don't remember if it ever happened on 4.7.
  7. I just ran into a bug as well. Found no network connectivity to my server, so I logged into the remote console and found the following spamming the console:

     e1000e: eth0 NIC Link is UP 1000 Mbps Full Duplex, Flow Control: Rx/Tx
     e1000e 0000:03:00.0: eth0: Reset adapter

     It's just spamming this about every second or so and has for a few hours now.
  8. It eventually threw me back to a login prompt. I can log in and see an mdrecoveryd process, lots of spinupd processes, and a sync process running. The webgui still just says "system is restarting".
  9. It also has a bunch of mdcmd import commands above the spinup ones. webgui is down and the only share over smb from the server right now is flash. It's been sitting for about 10-15 minutes now without any progress past this point.
  10. All addons are stopped/disabled at the moment. To reboot I basically went into the webgui, told it to stop the array, then hit the reboot checkbox and clicked reboot.
  11. I just went to reboot my server to update simplefeatures (running 5.0 beta 11), and during the shutdown sequence it got to a point where it showed mdcmd (22): spinup 0 and a bunch of other similar entries, then just hung with a blinking cursor and sat there forever. I've noticed this a few times in the past. Sometimes it reboots fine, however. Anyone else ever seen this?
  12. Correct. You want to let the current TTL run down on all the name servers out there so they cache the new value. Once the new value is cached you should be good to make your change.
  13. 30-60 minutes for the DNS TTL still seems like a bit much. You should be able to have them lower it to 5 minutes.
  14. Hello, I was randomly checking my server this morning and found the following entries:

      Mar 13 04:02:42 orbit logger: mover finished
      Mar 13 04:22:41 orbit kernel: mdcmd (729): spindown 5
      Mar 13 05:04:58 orbit kernel: mdcmd (730): spindown 1
      Mar 13 05:04:59 orbit kernel: mdcmd (731): spindown 2
      Mar 13 05:04:59 orbit kernel: mdcmd (732): spindown 3
      Mar 13 05:05:00 orbit kernel: mdcmd (733): spindown 4
      Mar 13 05:05:00 orbit kernel: mdcmd (734): spindown 6
      Mar 13 05:05:00 orbit kernel: mdcmd (735): spindown 7
      Mar 13 05:05:01 orbit kernel: mdcmd (736): spindown 8
      Mar 13 05:05:01 orbit kernel: mdcmd (737): spindown 9
      Mar 13 05:05:02 orbit kernel: mdcmd (738): spindown 11
      Mar 13 05:05:24 orbit kernel: mdcmd (739): spindown 0
      Mar 13 05:05:24 orbit kernel: mdcmd (740): spindown 10
      Mar 13 06:00:37 orbit kernel: sas: command 0xefb14e40, task 0xf0332c80, timed out: BLK_EH_NOT_HANDLED
      Mar 13 06:00:37 orbit kernel: sas: Enter sas_scsi_recover_host
      Mar 13 06:00:37 orbit kernel: sas: trying to find task 0xf0332c80
      Mar 13 06:00:37 orbit kernel: sas: sas_scsi_find_task: aborting task 0xf0332c80
      Mar 13 06:00:37 orbit kernel: drivers/scsi/mvsas/mv_sas.c 1703:<7>mv_abort_task() mvi=f7260000 task=f0332c80 slot=f727163c slot_idx=x2
      Mar 13 06:00:37 orbit kernel: sas: sas_scsi_find_task: querying task 0xf0332c80
      Mar 13 06:00:37 orbit kernel: drivers/scsi/mvsas/mv_sas.c 1632:mvs_query_task:rc= 5
      Mar 13 06:00:37 orbit kernel: sas: sas_scsi_find_task: task 0xf0332c80 failed to abort
      Mar 13 06:00:37 orbit kernel: sas: task 0xf0332c80 is not at LU: I_T recover
      Mar 13 06:00:37 orbit kernel: sas: I_T nexus reset for dev 0000000000000000
      Mar 13 06:00:37 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 0 ctrl sts=0x89800.
      Mar 13 06:00:37 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 0 irq sts = 0x1001
      Mar 13 06:00:37 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2111:phy0 Unplug Notice
      Mar 13 06:00:37 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 0 ctrl sts=0x199800.
      Mar 13 06:00:37 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 0 irq sts = 0x11081
      Mar 13 06:00:37 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2138:notify plug in on phy[0]
      Mar 13 06:00:37 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2163:plugin interrupt but phy0 is gone
      Mar 13 06:00:39 orbit kernel: mvsas 0000:02:00.0: Phy0 : No sig fis
      Mar 13 06:00:39 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2024:phy0 Attached Device
      Mar 13 06:00:39 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 0 ctrl sts=0x89800.
      Mar 13 06:00:39 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 0 irq sts = 0x1001
      Mar 13 06:00:39 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2111:phy0 Unplug Notice
      Mar 13 06:00:39 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 0 ctrl sts=0x199800.
      Mar 13 06:00:39 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 0 irq sts = 0x81
      Mar 13 06:00:39 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 0 ctrl sts=0x199800.
      Mar 13 06:00:39 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 0 irq sts = 0x10000
      Mar 13 06:00:39 orbit kernel: drivers/scsi/mvsas/mv_sas.c 2138:notify plug in on phy[0]
      Mar 13 06:00:39 orbit kernel: drivers/scsi/mvsas/mv_sas.c 1224:port 0 attach dev info is 40000
      Mar 13 06:00:39 orbit kernel: drivers/scsi/mvsas/mv_sas.c 1226:port 0 attach sas addr is 0
      Mar 13 06:00:39 orbit kernel: drivers/scsi/mvsas/mv_sas.c 378:phy 0 byte dmaded.
      Mar 13 06:00:39 orbit kernel: sas: sas_form_port: phy0 belongs to port0 already(1)!
      Mar 13 06:00:39 orbit kernel: drivers/scsi/mvsas/mv_sas.c 1586:mvs_I_T_nexus_reset for device[0]:rc= 0
      Mar 13 06:00:39 orbit kernel: sas: I_T 0000000000000000 recovered
      Mar 13 06:00:39 orbit kernel: sas: sas_ata_task_done: SAS error 8d
      Mar 13 06:00:39 orbit kernel: ata15: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00
      Mar 13 06:00:39 orbit kernel: ata15: status=0x01 { Error }
      Mar 13 06:00:39 orbit kernel: ata15: error=0x04 { DriveStatusError }
      Mar 13 06:00:39 orbit kernel: sas: --- Exit sas_scsi_recover_host

      Is this just a drive not responding briefly? Anything to be concerned about? I'm running 5.0 B11 currently and haven't seen any errors since March 1st prior to this. Thanks
  15. From what I saw, configurations with 1 of the controllers are fine. It's only people with 2 of them who experience the bug.
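
For anyone comparing syslogs like the ones quoted in the posts above, a quick tally of the error lines can make it easier to see which device keeps timing out. The following is only a rough sketch: the `summarize` function and its three regular expressions are hypothetical helpers, not part of unRAID or the mvsas driver, and they only recognize message shapes that appear in the pasted logs (`command ... timed out`, `failed command: ...`, and `SAS error ...`).

```python
import re
from collections import Counter

# Hypothetical patterns, written only against message shapes quoted in the posts above.
SD_TIMEOUT = re.compile(r"sd [\d:]+: \[(\w+)\] command \w+ timed out")
ATA_FAILED = re.compile(r"(ata\d+\.\d+): failed command: (.+?)\s*$")
SAS_ERROR = re.compile(r"sas_ata_task_done: SAS error (\w+)")


def summarize(lines):
    """Tally timed-out commands per block device, failed ATA commands per
    ATA device, and SAS error codes. Unrecognized lines are ignored."""
    summary = {"timeouts": Counter(), "failed": Counter(), "sas_errors": Counter()}
    for line in lines:
        m = SD_TIMEOUT.search(line)
        if m:
            summary["timeouts"][m.group(1)] += 1
            continue
        m = ATA_FAILED.search(line)
        if m:
            summary["failed"][(m.group(1), m.group(2))] += 1
            continue
        m = SAS_ERROR.search(line)
        if m:
            summary["sas_errors"][m.group(1)] += 1
    return summary


# A few lines copied from the two logs above, with and without syslog prefixes.
SAMPLE = """\
sd 8:0:3:0: [sdq] command ecb28c00 timed out
sd 8:0:3:0: [sdq] command ecb28300 timed out
ata18.00: failed command: WRITE FPDMA QUEUED
ata18.00: status: { DRDY }
sas: sas_ata_task_done: SAS error 8a
Mar 13 06:00:39 orbit kernel: sas: sas_ata_task_done: SAS error 8d
"""

if __name__ == "__main__":
    result = summarize(SAMPLE.splitlines())
    for section, counts in result.items():
        print(section, dict(counts))
```

To run it against a real log, pass an open file instead of the sample, for example `summarize(open("syslog.txt"))`, where `syslog.txt` is a placeholder for a saved copy of the syslog. The counts only show where errors cluster; they do not say whether the drive, the cable, or the controller driver is at fault.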