Everything posted by jbuszkie

  1. I take it back.. it's not dead.. but here are more errors.. this occurs several more times in the syslog Mar 6 13:14:55 Tower kernel: sd 0:0:3:0: [sdi] ASC=0x0 ASCQ=0x0 (Drive related) Mar 6 13:14:55 Tower kernel: sd 0:0:3:0: [sdi] CDB: cdb[0]=0x28: 28 00 80 43 a1 47 00 02 00 00 (Drive related) Mar 6 13:14:55 Tower kernel: end_request: I/O error, dev sdi, sector 2151915847 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915784/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915792/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915800/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915808/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915816/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915824/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915832/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915840/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915848/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915856/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915864/3, count: 1 (Errors) Mar 
6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915872/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915880/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915888/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915896/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915904/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915912/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915920/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915928/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915936/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915944/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915952/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915960/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915968/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe 
read error: 2151915976/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915984/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915992/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916000/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916008/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916016/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916024/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916032/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916040/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916048/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916056/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916064/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916072/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916080/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error 
(Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916088/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916096/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916104/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916112/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916120/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916128/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916136/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916144/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916152/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916160/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916168/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916176/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916184/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916192/3, count: 1 (Errors) Mar 6 
13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916200/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916208/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916216/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916224/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916232/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916240/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916248/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916256/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916264/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916272/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916280/3, count: 1 (Errors) Mar 6 13:14:55 Tower kernel: md: disk4 read error (Errors) Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151916288/3, count: 1 (Errors) Mar 6 13:15:26 Tower kernel: sas: command 0xf3c4d480, task 0xf19223c0, timed out: BLK_EH_NOT_HANDLED (Drive related) Mar 6 13:15:26 Tower kernel: sas: Enter sas_scsi_recover_host (Drive related) Mar 6 13:15:26 Tower kernel: sas: trying to find task 
0xf19223c0 (Drive related) Mar 6 13:15:26 Tower kernel: sas: sas_scsi_find_task: aborting task 0xf19223c0 (Drive related) Consequently disk4 is full... if that means anything.. So my problem might have nothing to do with my monthly parity check.. This might be a new issue.
  2. Damn!! It crashed again.. This time I was somewhat watching... and I was able to get part of the syslog. There are some errors in there.. and before it crashed, it started to slow down (the MB/s that is) Now I can't get to it at all. What do these errors mean? Is this adapter related or drive related? ATA10 is Mar 6 08:07:52 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1388:found dev[3:5] is gone. (System) Mar 6 08:07:52 Tower kernel: sas: sas_ata_phy_reset: Found ATA device. (Drive related) Mar 6 08:07:52 Tower kernel: ata10.00: ATA-7: SAMSUNG HD154UI, 1AG01118, max UDMA7 (Drive related) Mar 6 08:07:52 Tower kernel: ata10.00: 2930277168 sectors, multi 0: LBA48 NCQ (depth 31/32) (Drive related) Mar 6 08:07:52 Tower kernel: sdh: sdh1 (Drive related) Mar 6 08:07:52 Tower kernel: sd 0:0:2:0: [sdh] Attached SCSI disk (Drive related) Mar 6 08:07:52 Tower kernel: ata10.00: configured for UDMA/133 (Drive related) Mar 6 08:07:52 Tower kernel: scsi 0:0:3:0: Direct-Access ATA SAMSUNG HD154UI 1AG0 PQ: 0 ANSI: 5 (Drive related) Mar 6 08:07:52 Tower kernel: sd 0:0:3:0: [sdi] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB) (Drive related) Mar 6 08:07:52 Tower kernel: sd 0:0:3:0: [sdi] Write Protect is off (Drive related) Mar 6 08:07:52 Tower kernel: sd 0:0:3:0: [sdi] Mode Sense: 00 3a 00 00 (Drive related) Mar 6 08:07:52 Tower kernel: sd 0:0:3:0: [sdi] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA (Drive related) Mar 6 08:07:52 Tower kernel: sas: DONE DISCOVERY on port 3, pid:822, result:0 (Drive related) Mar 6 08:07:52 Tower kernel: sdi: sdi1 (Drive related) Mar 6 08:07:52 Tower kernel: sd 0:0:3:0: [sdi] Attached SCSI disk (Drive related) There errors are here: Mar 6 08:24:29 Tower emhttp: shcmd (66): /usr/sbin/hdparm -y /dev/sdf $stuff$> /dev/null (Drive related) Mar 6 11:53:48 Tower emhttp: shcmd (67): /usr/sbin/hdparm -y /dev/sdf $stuff$> /dev/null (Drive related) Mar 6 12:50:18 Tower kernel: sas: command 0xf1980600, task 0xf19223c0, timed 
out: BLK_EH_NOT_HANDLED (Drive related) Mar 6 12:50:18 Tower kernel: sas: Enter sas_scsi_recover_host (Drive related) Mar 6 12:50:18 Tower kernel: sas: trying to find task 0xf19223c0 (Drive related) Mar 6 12:50:18 Tower kernel: sas: sas_scsi_find_task: aborting task 0xf19223c0 (Drive related) Mar 6 12:50:18 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1703:<7>mv_abort_task() mvi=f7400000 task=f19223c0 slot=f74115d4 slot_idx=x0 (System) Mar 6 12:50:18 Tower kernel: sas: sas_scsi_find_task: querying task 0xf19223c0 (Drive related) Mar 6 12:50:18 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1632:mvs_query_task:rc= 5 (System) Mar 6 12:50:18 Tower kernel: sas: sas_scsi_find_task: task 0xf19223c0 failed to abort (Minor Issues) Mar 6 12:50:18 Tower kernel: sas: task 0xf19223c0 is not at LU: I_T recover (Drive related) Mar 6 12:50:18 Tower kernel: sas: I_T nexus reset for dev 0300000000000000 (Drive related) Mar 6 12:50:18 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x89800. (System) Mar 6 12:50:18 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x1001 (System) Mar 6 12:50:18 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2111:phy3 Unplug Notice (System) Mar 6 12:50:18 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x199800. (System) Mar 6 12:50:18 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x1081 (System) Mar 6 12:50:18 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x199800. (System) Mar 6 12:50:18 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x10000 (System) Mar 6 12:50:18 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2138:notify plug in on phy[3] (System) Mar 6 12:50:18 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1224:port 3 attach dev info is 0 (System) Mar 6 12:50:18 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1226:port 3 attach sas addr is 3 (System) Mar 6 12:50:18 Tower kernel: drivers/scsi/mvsas/mv_sas.c 378:phy 3 byte dmaded. 
(System) Mar 6 12:50:18 Tower kernel: sas: sas_form_port: phy3 belongs to port3 already(1)! (Drive related) Mar 6 12:50:20 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1586:mvs_I_T_nexus_reset for device[3]:rc= 0 (System) Mar 6 12:50:20 Tower kernel: sas: I_T 0300000000000000 recovered (Drive related) Mar 6 12:50:20 Tower kernel: sas: sas_ata_task_done: SAS error 8d (Errors) Mar 6 12:50:20 Tower kernel: ata10: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related) Mar 6 12:50:20 Tower kernel: ata10: status=0x01 { Error } (Errors) Mar 6 12:50:20 Tower kernel: ata10: error=0x04 { DriveStatusError } (Errors) Mar 6 12:50:20 Tower kernel: sas: --- Exit sas_scsi_recover_host (Drive related) Mar 6 12:50:20 Tower kernel: sas: sas_to_ata_err: Saw error 2. What to do? (Errors) Mar 6 12:50:20 Tower kernel: sas: sas_ata_task_done: SAS error 2 (Errors) Mar 6 12:50:20 Tower kernel: ata10: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related) Mar 6 12:50:20 Tower kernel: ata10: status=0x01 { Error } (Errors) Mar 6 12:50:20 Tower kernel: ata10: error=0x04 { DriveStatusError } (Errors) Mar 6 12:50:51 Tower kernel: sas: command 0xf1980600, task 0xf19223c0, timed out: BLK_EH_NOT_HANDLED (Drive related) Mar 6 12:50:51 Tower kernel: sas: Enter sas_scsi_recover_host (Drive related) Mar 6 12:50:51 Tower kernel: sas: trying to find task 0xf19223c0 (Drive related) Mar 6 12:50:51 Tower kernel: sas: sas_scsi_find_task: aborting task 0xf19223c0 (Drive related) Mar 6 12:50:51 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1703:<7>mv_abort_task() mvi=f7400000 task=f19223c0 slot=f74115d4 slot_idx=x0 (System) Mar 6 12:50:51 Tower kernel: sas: sas_scsi_find_task: querying task 0xf19223c0 (Drive related) Mar 6 12:50:51 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1632:mvs_query_task:rc= 5 (System) Mar 6 12:50:51 Tower kernel: sas: sas_scsi_find_task: task 0xf19223c0 failed to abort (Minor Issues) Mar 6 12:50:51 Tower kernel: sas: task 0xf19223c0 is not at LU: I_T 
recover (Drive related) Mar 6 12:50:51 Tower kernel: sas: I_T nexus reset for dev 0300000000000000 (Drive related) Mar 6 12:50:51 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x89800. (System) Mar 6 12:50:51 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x1001001 (System) Mar 6 12:50:51 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2111:phy3 Unplug Notice (System) Mar 6 12:50:51 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x199800. (System) Mar 6 12:50:51 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x1001081 (System) Mar 6 12:50:51 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x199800. (System) Mar 6 12:50:51 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x10000 (System) Mar 6 12:50:51 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2138:notify plug in on phy[3] (System) Mar 6 12:50:51 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1224:port 3 attach dev info is 0 (System) Mar 6 12:50:51 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1226:port 3 attach sas addr is 3 (System) Mar 6 12:50:51 Tower kernel: drivers/scsi/mvsas/mv_sas.c 378:phy 3 byte dmaded. (System) Mar 6 12:50:51 Tower kernel: sas: sas_form_port: phy3 belongs to port3 already(1)! (Drive related) Mar 6 12:50:53 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1586:mvs_I_T_nexus_reset for device[3]:rc= 0 (System) Mar 6 12:50:53 Tower kernel: sas: I_T 0300000000000000 recovered (Drive related) Mar 6 12:50:53 Tower kernel: sas: sas_ata_task_done: SAS error 8d (Errors) Mar 6 12:50:53 Tower kernel: ata10: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related) Mar 6 12:50:53 Tower kernel: ata10: status=0x01 { Error } (Errors) Mar 6 12:50:53 Tower kernel: ata10: error=0x04 { DriveStatusError } (Errors) Mar 6 12:50:53 Tower kernel: sas: --- Exit sas_scsi_recover_host (Drive related) Mar 6 12:50:53 Tower kernel: sas: sas_to_ata_err: Saw error 2. What to do? 
(Errors) Mar 6 12:50:53 Tower kernel: sas: sas_ata_task_done: SAS error 2 (Errors) Mar 6 12:50:53 Tower kernel: ata10: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related) Mar 6 12:50:53 Tower kernel: ata10: status=0x01 { Error } (Errors) Mar 6 12:50:53 Tower kernel: ata10: error=0x04 { DriveStatusError } (Errors) Mar 6 12:51:24 Tower kernel: sas: command 0xf1980600, task 0xf19223c0, timed out: BLK_EH_NOT_HANDLED (Drive related) Mar 6 12:51:24 Tower kernel: sas: Enter sas_scsi_recover_host (Drive related) Mar 6 12:51:24 Tower kernel: sas: trying to find task 0xf19223c0 (Drive related) Mar 6 12:51:24 Tower kernel: sas: sas_scsi_find_task: aborting task 0xf19223c0 (Drive related) Mar 6 12:51:24 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1703:<7>mv_abort_task() mvi=f7400000 task=f19223c0 slot=f74115d4 slot_idx=x0 (System) Mar 6 12:51:24 Tower kernel: sas: sas_scsi_find_task: querying task 0xf19223c0 (Drive related) Mar 6 12:51:24 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1632:mvs_query_task:rc= 5 (System) Mar 6 12:51:24 Tower kernel: sas: sas_scsi_find_task: task 0xf19223c0 failed to abort (Minor Issues) Mar 6 12:51:24 Tower kernel: sas: task 0xf19223c0 is not at LU: I_T recover (Drive related) Mar 6 12:51:24 Tower kernel: sas: I_T nexus reset for dev 0300000000000000 (Drive related) Mar 6 12:51:24 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x89800. (System) Mar 6 12:51:24 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x1001 (System) Mar 6 12:51:24 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2111:phy3 Unplug Notice (System) Mar 6 12:51:24 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x199800. (System) Mar 6 12:51:24 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x1081 (System) Mar 6 12:51:24 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x199800. 
(System) Mar 6 12:51:24 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x10000 (System) Mar 6 12:51:24 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2138:notify plug in on phy[3] (System) Mar 6 12:51:24 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1224:port 3 attach dev info is 0 (System) Mar 6 12:51:24 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1226:port 3 attach sas addr is 3 (System) Mar 6 12:51:24 Tower kernel: drivers/scsi/mvsas/mv_sas.c 378:phy 3 byte dmaded. (System) Mar 6 12:51:24 Tower kernel: sas: sas_form_port: phy3 belongs to port3 already(1)! (Drive related) Mar 6 12:51:26 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1586:mvs_I_T_nexus_reset for device[3]:rc= 0 (System) Mar 6 12:51:26 Tower kernel: sas: I_T 0300000000000000 recovered (Drive related) Mar 6 12:51:26 Tower kernel: sas: sas_ata_task_done: SAS error 8d (Errors) Mar 6 12:51:26 Tower kernel: ata10: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related) Mar 6 12:51:26 Tower kernel: ata10: status=0x01 { Error } (Errors) Mar 6 12:51:26 Tower kernel: ata10: error=0x04 { DriveStatusError } (Errors) Mar 6 12:51:26 Tower kernel: sas: --- Exit sas_scsi_recover_host (Drive related) Mar 6 12:51:26 Tower kernel: sas: sas_to_ata_err: Saw error 2. What to do? 
(Errors) Mar 6 12:51:26 Tower kernel: sas: sas_ata_task_done: SAS error 2 (Errors) Mar 6 12:51:26 Tower kernel: ata10: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related) Mar 6 12:51:26 Tower kernel: ata10: status=0x01 { Error } (Errors) Mar 6 12:51:26 Tower kernel: ata10: error=0x04 { DriveStatusError } (Errors) Mar 6 12:52:03 Tower kernel: sas: command 0xf3e56240, task 0xf19223c0, timed out: BLK_EH_NOT_HANDLED (Drive related) Mar 6 12:52:03 Tower kernel: sas: Enter sas_scsi_recover_host (Drive related) Mar 6 12:52:03 Tower kernel: sas: trying to find task 0xf19223c0 (Drive related) Mar 6 12:52:03 Tower kernel: sas: sas_scsi_find_task: aborting task 0xf19223c0 (Drive related) Mar 6 12:52:03 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1703:<7>mv_abort_task() mvi=f7400000 task=f19223c0 slot=f74115d4 slot_idx=x0 (System) Mar 6 12:52:03 Tower kernel: sas: sas_scsi_find_task: querying task 0xf19223c0 (Drive related) Mar 6 12:52:03 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1632:mvs_query_task:rc= 5 (System) Mar 6 12:52:03 Tower kernel: sas: sas_scsi_find_task: task 0xf19223c0 failed to abort (Minor Issues) Mar 6 12:52:03 Tower kernel: sas: task 0xf19223c0 is not at LU: I_T recover (Drive related) Mar 6 12:52:03 Tower kernel: sas: I_T nexus reset for dev 0300000000000000 (Drive related) Mar 6 12:52:03 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x89800. (System) Mar 6 12:52:03 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x1001001 (System) Mar 6 12:52:03 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2111:phy3 Unplug Notice (System) Mar 6 12:52:03 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x199800. (System) Mar 6 12:52:03 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x1001081 (System) Mar 6 12:52:03 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2083:port 3 ctrl sts=0x199800. 
(System) Mar 6 12:52:03 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2085:Port 3 irq sts = 0x10000 (System) Mar 6 12:52:03 Tower kernel: drivers/scsi/mvsas/mv_sas.c 2138:notify plug in on phy[3] (System) Mar 6 12:52:03 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1224:port 3 attach dev info is 0 (System) Mar 6 12:52:03 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1226:port 3 attach sas addr is 3 (System) Mar 6 12:52:03 Tower kernel: drivers/scsi/mvsas/mv_sas.c 378:phy 3 byte dmaded. (System) Mar 6 12:52:03 Tower kernel: sas: sas_form_port: phy3 belongs to port3 already(1)! (Drive related) Mar 6 12:52:05 Tower kernel: drivers/scsi/mvsas/mv_sas.c 1586:mvs_I_T_nexus_reset for device[3]:rc= 0 (System) Mar 6 12:52:05 Tower kernel: sas: I_T 0300000000000000 recovered (Drive related) Mar 6 12:52:05 Tower kernel: sas: sas_ata_task_done: SAS error 8d (Errors) Mar 6 12:52:05 Tower kernel: ata10: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related) Mar 6 12:52:05 Tower kernel: ata10: status=0x01 { Error } (Errors) Mar 6 12:52:05 Tower kernel: ata10: error=0x04 { DriveStatusError } (Errors) Mar 6 12:52:05 Tower kernel: sas: --- Exit sas_scsi_recover_host (Drive related) Mar 6 12:52:05 Tower kernel: sas: sas_to_ata_err: Saw error 2. What to do? (Errors) Mar 6 12:52:05 Tower kernel: sas: sas_ata_task_done: SAS error 2 (Errors) Mar 6 12:52:05 Tower kernel: ata10: translated ATA stat/err 0x01/04 to SCSI SK/ASC/ASCQ 0xb/00/00 (Drive related) Mar 6 12:52:05 Tower kernel: ata10: status=0x01 { Error } (Errors) Mar 6 12:52:05 Tower kernel: ata10: error=0x04 { DriveStatusError } (Errors)
  3. Maybe I'll try that... It crashed again last night, probably when some of my nightly backups were trying to run or when the mover script kicked in. It's at 31% now. I won't really access the array until it's done... or at least until it's past some of the smaller drives. The 1TB should be done soon, then the 1.5s. Maybe I'll experiment then. Jim
  4. I found a CORSAIR Gaming Series GS600 at work that I can borrow for a couple of days. Right now it's $82.48 with a $5 MIR until tomorrow... I'll give this a try tonight. What a pain! I'm not looking forward to swapping out the PSU! But I guess I really shouldn't be cheap on the PSU (or any component) for a system that I need to have running 24/7 and that holds things I don't really want to lose!
  5. The only other PSU I have at home is an older single rail 25A supply. Not sure that's beefy enough.. Maybe I should try it anyway..
  6. I just wish there was a way to rule out the beta code before I plunk down $$ for a good PSU!
  7. It originally used 1 rail for the MB, 2 for PCIe, and 1 for drives. I rewired half of the drives (not Mickey Moused.. soldered correctly) to use one of the PCIe power rails, which I have no need of.
  8. That's all good. It went into a new case a couple months ago. Everything is clean. (problem occurred with old case as well - same PSU though)
  9. Why do you say that? 5 drives * 2A/drive = 10A. Even at 3A/drive that only gets to 15A. That's still below the 20A/rail.
  10. What is the lowest you guys have seen the SeaSonic X 650 PSU go for? I'm having some issues with my rig (that's a separate thread!) and I'm trying to see how much it would cost me to eliminate my PSU as the issue. Right now the X650 is ~$140 w/FS. Does it go much lower than that? Raj, maybe we need a dedicated PSU good deals thread like the RAM one and others! Jim
  11. And it has, in the past, done a manual full parity check to completion (with no other disk activity going on). What was different this time is that I was writing to the array (not the cache disk) at the same time.
  12. I actually have the drives split across two of the 12V rails. So I have 5 on 1 and 5 on the other. So each 5 drives has (in theory) 20A available to it (there are 4 20A rails total)
  13. Ok.. It's still doing this! But today it also crashed while doing a regular parity check. The difference is I was also trying to do other things at the same time. Today I noticed that UnRaid wasn't responding. Hmm.. beginning of the month. Probably the auto parity check! So I power cycled and it started the parity check. It seemed to be humming along, so I decided to rip a DVD to the server. Sometime after that, UnRaid crashed again! I have a 630W power supply (This one). Half my drives are on one 12V rail and the others are on another rail. My issues started when I went to the 5.0 Beta, but I also added more drives at the same time! So I'm scratching my head to figure this out! There is a part of me that thinks my PSU just plain sucks! But I don't want to drop >$100 on a new PSU to test that theory. But there is also a part of me that thinks the Beta may be causing my issues! Any thoughts on how to diagnose? I can't go back to a non-beta because I'm using 3TB drives. Jim P.S. I have 10 drives (8 data, 1 parity, 1 cache), almost all green drives.
  14. The last two I got... I got through a place called 3btech.. http://3btech.net/noss5baysasa.html I had no issues with them, and it didn't take 2 months to ship.. but.. I also got mine a while ago.. It came drop-shipped from Norco, so maybe Norco was OOS too and no one had any??
  15. Are they normally not in stock at Newegg?
  16. I just got a nice 2200VA APC UPS. I'm trying to figure out the best way to set it up. It will have two machines on it: 1. my UnRaid server 2. a Win 7 HTPC machine. So what's the best way to set it up so it will turn off both machines?? 1. Use PowerChute on the Win 7 machine and somehow have it power off the UnRaid machine (is this possible?) 2. Use the Windows version of apcupsd and have it remotely power off UnRaid (is this even possible either?) 3. Use apcupsd on UnRaid and have it remotely power off the Win 7 machine, either through PowerChute or apcupsd running on the Win 7 machine. Now I don't know which of these options are even possible! What are your guys' thoughts? Thanks, Jim
  17. Not sure if this is a good deal or not.. but it's a synchronous SSD. http://www.buy.com/prod/crucial-128gb-m4-2-5-sata-iii-mlc-internal-solid-state-drive/221150373.html No MIB +FS
  18. Yes. It runs if I hit the parity check button. No I haven't tried the unmenu version. I'll have to look into that.
  19. Well... It happened again. I think I'll disable this and have it e-mail me a reminder once a month to do it manually! Maybe when I have some spare cycles I'll try it manually from the command line mimicking the cron script. *pout*
  20. I don't know the exact time, but it was somewhere between 8 and 12 hours. And my CPU is not as strong as yours.
  21. That's not what I was referring to.. I was referring to putting my old disk back in service as a data disk. I assume now I have to preclear it if I want to quickly add it to the array! It would be nice if UnRaid would see that I wanted to add a bigger disk as a data disk and allow me to swap the bigger disk in as the parity and make the old disk a data disk at the same time! Instead I had to create the new parity disk with the parity sync (which finished. Yeah!) and now I have my old parity disk, which I have to preclear before adding it to the array (if I want to do it quickly). Jim
  22. So I guess that means I have to preclear the old parity drive before it can go into the system (quickly). Jim
  23. You certainly can use an old parity disk as a data disk... but it's a two step process!
  24. I thought there was a way to do it in one step? It would add the old parity disk as a data disk in the same step as adding the new parity. Maybe a beta 5 feature? Was I just imagining it?? Jim
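For the rail arithmetic in the PSU posts above (5 drives per rail, 2A or 3A per drive, 20A per rail), here is a rough sketch of the same sum in Python. The function names are made up for illustration, and the per-drive currents are just the estimates from the post, not measured values. It only covers steady-state draw; spin-up surge is not modeled, so treat the headroom as optimistic.

```python
# Steady-state 12V rail budget, using the figures quoted in the thread.
# Names are hypothetical; per-drive currents are the poster's estimates.

RAIL_LIMIT_A = 20.0  # per-rail rating quoted in the thread


def rail_load_amps(drives: int, amps_per_drive: float) -> float:
    """Total current drawn on one rail by identical drives."""
    return drives * amps_per_drive


def rail_headroom_amps(drives: int, amps_per_drive: float,
                       rail_limit: float = RAIL_LIMIT_A) -> float:
    """Remaining current on the rail; negative means over the limit."""
    return rail_limit - rail_load_amps(drives, amps_per_drive)


if __name__ == "__main__":
    # 5 drives per rail, at the two per-drive estimates from the post.
    for amps in (2.0, 3.0):
        load = rail_load_amps(5, amps)
        spare = rail_headroom_amps(5, amps)
        print(f"5 drives at {amps:.0f}A each: {load:.0f}A load, {spare:.0f}A spare")
```

If the spare figure stays positive even at the pessimistic 3A/drive, the per-rail limit alone probably isn't the explanation, though that says nothing about spin-up peaks or the overall quality of the supply.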
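The syslog dumps in the first two posts are hard to read by eye, so a small script to boil them down might help. This is only a sketch: it counts the `md: diskN read error` lines per disk and reports the lowest and highest number from the `handle_stripe read error` lines. Treating that first number as a sector is an assumption on my part, and `summarize` and `SAMPLE` are names invented here. The sample lines are copied from the log above with the trailing "(Errors)" tags dropped.

```python
import re
from collections import Counter

# Patterns matched against the syslog lines quoted in the thread.
DISK_RE = re.compile(r"md: (disk\d+) read error")
STRIPE_RE = re.compile(r"handle_stripe read error: (\d+)/\d+, count: \d+")


def summarize(lines):
    """Return (per-disk read error counts, (lowest, highest) stripe number)."""
    disk_errors = Counter()
    sectors = []
    for line in lines:
        disk = DISK_RE.search(line)
        if disk:
            disk_errors[disk.group(1)] += 1
        stripe = STRIPE_RE.search(line)
        if stripe:
            # Assumed to be a sector number; the log itself does not say.
            sectors.append(int(stripe.group(1)))
    span = (min(sectors), max(sectors)) if sectors else None
    return disk_errors, span


# A few lines taken from the first post's log.
SAMPLE = [
    "Mar 6 13:14:55 Tower kernel: sd 0:0:3:0: [sdi] ASC=0x0 ASCQ=0x0",
    "Mar 6 13:14:55 Tower kernel: md: disk4 read error",
    "Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915784/3, count: 1",
    "Mar 6 13:14:55 Tower kernel: md: disk4 read error",
    "Mar 6 13:14:55 Tower kernel: handle_stripe read error: 2151915792/3, count: 1",
]

if __name__ == "__main__":
    errors, span = summarize(SAMPLE)
    print(dict(errors), span)
```

To run it on a real log, pass an open file instead of `SAMPLE`, one log entry per line. A single disk with a tight span of errors points at one bad region or one flaky port; errors spread over several disks would look more like a controller or power problem.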