UDMA CRC error count


Posted

I have an IBM (Xyratex) HS-1235E that I have been running for about 6 years. It has a 2 TB drive limit. I am almost at capacity and decided to look into a way around this. One of the things I came across was this (https://linustechtips.com/main/topic/983215-burnt-ibm-hs-1235e-9211-8i-6g-conversion/). It has some pictures to help explain if needed. From the backplane there is an expander that goes SFF-8087 to a 4 SAS/SATA breakout that went to the motherboard. I picked up an LSI 9212-4i4e. It is running the newest firmware, P20, in IT mode. I installed it last night and it picked up all existing drives (10 2TB Reds). I then installed a shucked 8TB drive from an Easystore; the full 8TB was recognized and I was a happy camper.

I set it to pre-clear and went to bed, checked it in the morning and all seemed good. When I checked later in the afternoon it was at 80% of the pre-read and had a bunch of CRC errors. With the pre-read done, the drive is currently sitting at a CRC count of 8150. During the pre-read I tried to access data on the other drives and that went well, but after checking the SMART data on those I noticed their CRC counts had gone up (all were at 0 to start). Now that the new drive is zeroing there have been no more errors on it (only at 30% currently).

Most of what I read seems to suggest the cable, but I don't want to go throwing parts at it if I'm going to end up having to upgrade the whole thing because this won't work. I know most would say to just upgrade based on the age, but for what I use it for (90% Plex server/10% backups) I have no need to, other than running out of space. I will copy in the last few log entries. Thanks for any help or advice.

 

May 5 16:12:41 Tower kernel: sd 9:0:10:0: [sdl] Unaligned partial completion (resid=492540, sector_sz=512)
May 5 16:12:41 Tower kernel: sd 9:0:10:0: [sdl] tag#717 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
May 5 16:12:41 Tower kernel: sd 9:0:10:0: [sdl] tag#717 Sense Key : 0xb [current]
May 5 16:12:41 Tower kernel: sd 9:0:10:0: [sdl] tag#717 ASC=0x47 ASCQ=0x3
May 5 16:12:41 Tower kernel: sd 9:0:10:0: [sdl] tag#717 CDB: opcode=0x88 88 00 00 00 00 03 a3 72 44 00 00 00 04 00 00 00
May 5 16:12:41 Tower kernel: print_req_error: I/O error, dev sdl, sector 15627076608
May 5 16:12:46 Tower kernel: mpt2sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
May 5 16:12:46 Tower kernel: sd 9:0:10:0: [sdl] tag#957 UNKNOWN(0x2003) Result: hostbyte=0x0b driverbyte=0x00
May 5 16:12:46 Tower kernel: sd 9:0:10:0: [sdl] tag#957 CDB: opcode=0x88 88 00 00 00 00 03 a3 7f da 00 00 00 04 40 00 00
May 5 16:12:46 Tower kernel: print_req_error: I/O error, dev sdl, sector 15627966976
May 5 16:12:46 Tower kernel: mpt2sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
May 5 16:12:46 Tower kernel: sd 9:0:10:0: [sdl] tag#959 UNKNOWN(0x2003) Result: hostbyte=0x0b driverbyte=0x00
May 5 16:12:46 Tower kernel: sd 9:0:10:0: [sdl] tag#959 CDB: opcode=0x88 88 00 00 00 00 03 a3 7f de 40 00 00 04 50 00 00
May 5 16:12:46 Tower kernel: print_req_error: I/O error, dev sdl, sector 15627968064
May 5 16:12:46 Tower kernel: mpt2sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
May 5 16:12:46 Tower kernel: sd 9:0:10:0: [sdl] tag#896 UNKNOWN(0x2003) Result: hostbyte=0x0b driverbyte=0x00
May 5 16:12:46 Tower kernel: sd 9:0:10:0: [sdl] tag#896 CDB: opcode=0x88 88 00 00 00 00 03 a3 7f e2 90 00 00 01 70 00 00
May 5 16:12:46 Tower kernel: print_req_error: I/O error, dev sdl, sector 15627969168
May 5 16:12:46 Tower kernel: mpt2sas_cm0: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
May 5 16:12:46 Tower kernel: sd 9:0:10:0: [sdl] Unaligned partial completion (resid=41980, sector_sz=512)
May 5 16:12:46 Tower kernel: sd 9:0:10:0: [sdl] tag#958 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
May 5 16:12:46 Tower kernel: sd 9:0:10:0: [sdl] tag#958 Sense Key : 0xb [current]
May 5 16:12:46 Tower kernel: sd 9:0:10:0: [sdl] tag#958 ASC=0x47 ASCQ=0x3
May 5 16:12:46 Tower kernel: sd 9:0:10:0: [sdl] tag#958 CDB: opcode=0x88 88 00 00 00 00 03 a3 7f d8 00 00 00 02 00 00 00
May 5 16:12:46 Tower kernel: print_req_error: I/O error, dev sdl, sector 15627966464
May 5 16:12:48 Tower preclear_disk_2SGA5U5J[12683]: Pre-Read: progress - 100% read @ 84 MB/s
May 5 16:12:49 Tower preclear_disk_2SGA5U5J[12683]: Pre-Read: dd - read 8001565319168 of 8001563222016.
May 5 16:12:49 Tower preclear_disk_2SGA5U5J[12683]: Pre-Read: elapsed time - 18:00:31
May 5 16:12:49 Tower preclear_disk_2SGA5U5J[12683]: Pre-Read: dd exit code - 0
May 5 16:12:50 Tower preclear_disk_2SGA5U5J[12683]: Zeroing: emptying the MBR.
May 5 16:12:50 Tower preclear_disk_2SGA5U5J[12683]: Zeroing: dd if=/dev/zero of=/dev/sdl bs=2097152 seek=2097152 count=8001561124864 conv=notrunc iflag=count_bytes,nocache,fullblock oflag=seek_bytes
May 5 16:12:50 Tower preclear_disk_2SGA5U5J[12683]: Zeroing: dd pid [9598]
May 5 16:12:50 Tower rc.diskinfo[10757]: SIGHUP received, forcing refresh of disks info.
May 5 17:22:58 Tower preclear_disk_2SGA5U5J[12683]: Zeroing: progress - 10% zeroed
May 5 18:35:27 Tower preclear_disk_2SGA5U5J[12683]: Zeroing: progress - 20% zeroed
May 5 19:50:25 Tower preclear_disk_2SGA5U5J[12683]: Zeroing: progress - 30% zeroed
May 5 20:18:34 Tower emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/disk_log sdl
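
For anyone reading the log above: the `CDB:` lines and the `print_req_error` lines describe the same failed reads, and the sector in the error is simply the starting LBA encoded in the CDB. A minimal Python sketch of that decoding follows, assuming the standard READ(16) (opcode 0x88) and READ(10) (opcode 0x28) layouts. The function name `decode_read_cdb` is made up for illustration and is not part of any Unraid or kernel tool.

```python
from typing import Tuple


def decode_read_cdb(cdb_hex: str) -> Tuple[int, int]:
    """Return (starting LBA, number of blocks) for a READ(10) or READ(16) CDB.

    cdb_hex is the space-separated hex string that follows "opcode=0x.." in
    the kernel log line. Other opcodes are rejected rather than guessed at.
    """
    cdb = bytes.fromhex(cdb_hex)
    opcode = cdb[0]
    if opcode == 0x88 and len(cdb) == 16:
        # READ(16): 8-byte big-endian LBA in bytes 2-9, 4-byte length in 10-13
        return int.from_bytes(cdb[2:10], "big"), int.from_bytes(cdb[10:14], "big")
    if opcode == 0x28 and len(cdb) == 10:
        # READ(10): 4-byte big-endian LBA in bytes 2-5, 2-byte length in 7-8
        return int.from_bytes(cdb[2:6], "big"), int.from_bytes(cdb[7:9], "big")
    raise ValueError(f"unsupported CDB (opcode 0x{opcode:02x}, {len(cdb)} bytes)")


if __name__ == "__main__":
    # First failing command from the log above (tag#717)
    lba, blocks = decode_read_cdb("88 00 00 00 00 03 a3 72 44 00 00 00 04 00 00 00")
    print(lba, blocks)
```

For the tag#717 command this gives LBA 15627076608, which is the sector reported on the `print_req_error` line right after it, and a transfer length of 1024 blocks.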

 

Posted

So far I am at 70% pre-read. I have encountered two issues. The first is that any time I am in the web interface and click between tabs (going from Dashboard to Main, let's say) it logs this error:

May 6 17:57:01 Tower preclear_disk[15126]: error encountered, exiting...

The number in the brackets is always different.

 

The bigger issue is that the card reset a few hours ago; see the logs below. This is the only error I have received thus far. The only thing I find while searching is a possible firmware issue and to downgrade to P19. I have ordered a replacement cable just in case, but what are your thoughts on the firmware?

 

May  6 16:02:43 Tower kernel: mpt2sas_cm0: fault_state(0x7e23)!
May  6 16:02:43 Tower kernel: mpt2sas_cm0: sending diag reset !!
May  6 16:02:43 Tower kernel: mpt2sas_cm0: diag reset: SUCCESS
May  6 16:02:44 Tower kernel: mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
May  6 16:02:44 Tower kernel: mpt2sas_cm0: LSISAS2008: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(07.39.02.00)
May  6 16:02:44 Tower kernel: mpt2sas_cm0: Protocol=(
May  6 16:02:44 Tower kernel: Initiator
May  6 16:02:44 Tower kernel: ,Target
May  6 16:02:44 Tower kernel: ), 
May  6 16:02:44 Tower kernel: Capabilities=(
May  6 16:02:44 Tower kernel: TLR
May  6 16:02:44 Tower kernel: ,EEDP
May  6 16:02:44 Tower kernel: ,Snapshot Buffer
May  6 16:02:44 Tower kernel: ,Diag Trace Buffer
May  6 16:02:44 Tower kernel: ,Task Set Full
May  6 16:02:44 Tower kernel: ,NCQ
May  6 16:02:44 Tower kernel: )
May  6 16:02:44 Tower kernel: mpt2sas_cm0: sending port enable !!
May  6 16:02:51 Tower kernel: mpt2sas_cm0: port enable: SUCCESS
May  6 16:02:51 Tower kernel: mpt2sas_cm0: search for end-devices: start
May  6 16:02:51 Tower kernel: scsi target9:0:0: handle(0x000a), sas_addr(0x50050cc10a963500)
May  6 16:02:51 Tower kernel: scsi target9:0:0: enclosure logical id(0x50050cc10a96351f), slot(0)
May  6 16:02:51 Tower kernel: scsi target9:0:1: handle(0x000b), sas_addr(0x50050cc10a963501)
May  6 16:02:51 Tower kernel: scsi target9:0:1: enclosure logical id(0x50050cc10a96351f), slot(1)
May  6 16:02:51 Tower kernel: scsi target9:0:2: handle(0x000c), sas_addr(0x50050cc10a963502)
May  6 16:02:51 Tower kernel: scsi target9:0:2: enclosure logical id(0x50050cc10a96351f), slot(2)
May  6 16:02:51 Tower kernel: scsi target9:0:3: handle(0x000d), sas_addr(0x50050cc10a963503)
May  6 16:02:51 Tower kernel: scsi target9:0:3: enclosure logical id(0x50050cc10a96351f), slot(3)
May  6 16:02:51 Tower kernel: scsi target9:0:4: handle(0x000e), sas_addr(0x50050cc10a963504)
May  6 16:02:51 Tower kernel: scsi target9:0:4: enclosure logical id(0x50050cc10a96351f), slot(4)
May  6 16:02:51 Tower kernel: scsi target9:0:5: handle(0x000f), sas_addr(0x50050cc10a963505)
May  6 16:02:51 Tower kernel: scsi target9:0:5: enclosure logical id(0x50050cc10a96351f), slot(5)
May  6 16:02:51 Tower kernel: scsi target9:0:6: handle(0x0010), sas_addr(0x50050cc10a963506)
May  6 16:02:51 Tower kernel: scsi target9:0:6: enclosure logical id(0x50050cc10a96351f), slot(6)
May  6 16:02:51 Tower kernel: scsi target9:0:7: handle(0x0011), sas_addr(0x50050cc10a963507)
May  6 16:02:51 Tower kernel: scsi target9:0:7: enclosure logical id(0x50050cc10a96351f), slot(7)
May  6 16:02:51 Tower kernel: scsi target9:0:8: handle(0x0012), sas_addr(0x50050cc10a963508)
May  6 16:02:51 Tower kernel: scsi target9:0:8: enclosure logical id(0x50050cc10a96351f), slot(8)
May  6 16:02:51 Tower kernel: scsi target9:0:9: handle(0x0013), sas_addr(0x50050cc10a963509)
May  6 16:02:51 Tower kernel: scsi target9:0:9: enclosure logical id(0x50050cc10a96351f), slot(9)
May  6 16:02:51 Tower kernel: scsi target9:0:10: handle(0x0014), sas_addr(0x50050cc10a96350b)
May  6 16:02:51 Tower kernel: scsi target9:0:10: enclosure logical id(0x50050cc10a96351f), slot(11)
May  6 16:02:51 Tower kernel: scsi target9:0:11: handle(0x0015), sas_addr(0x50050cc10a96351e)
May  6 16:02:51 Tower kernel: scsi target9:0:11: enclosure logical id(0x50050cc10a96351f), slot(24)
May  6 16:02:51 Tower kernel: mpt2sas_cm0: search for end-devices: complete
May  6 16:02:51 Tower kernel: mpt2sas_cm0: search for end-devices: start
May  6 16:02:51 Tower kernel: mpt2sas_cm0: search for PCIe end-devices: complete
May  6 16:02:51 Tower kernel: mpt2sas_cm0: search for expanders: start
May  6 16:02:51 Tower kernel:     expander present: handle(0x0009), sas_addr(0x50050cc10a96351f)
May  6 16:02:51 Tower kernel: mpt2sas_cm0: search for expanders: complete
May  6 16:02:51 Tower kernel: mpt2sas_cm0: _base_fault_reset_work: hard reset: success
May  6 16:02:51 Tower kernel: mpt2sas_cm0: removing unresponding devices: start
May  6 16:02:51 Tower kernel: mpt2sas_cm0: removing unresponding devices: end-devices
May  6 16:02:51 Tower kernel: mpt2sas_cm0:  Removing unresponding devices: pcie end-devices
May  6 16:02:51 Tower kernel: mpt2sas_cm0: removing unresponding devices: expanders
May  6 16:02:51 Tower kernel: mpt2sas_cm0: removing unresponding devices: complete
May  6 16:02:51 Tower kernel: mpt2sas_cm0: scan devices: start
May  6 16:02:51 Tower kernel: mpt2sas_cm0:     scan devices: expanders start
May  6 16:02:51 Tower kernel: mpt2sas_cm0:     break from expander scan: ioc_status(0x0022), loginfo(0x310f0400)
May  6 16:02:51 Tower kernel: mpt2sas_cm0:     scan devices: expanders complete
May  6 16:02:51 Tower kernel: mpt2sas_cm0:     scan devices: end devices start
May  6 16:02:51 Tower kernel: mpt2sas_cm0:     break from end device scan: ioc_status(0x0022), loginfo(0x310f0400)
May  6 16:02:51 Tower kernel: mpt2sas_cm0:     scan devices: end devices complete
May  6 16:02:51 Tower kernel: mpt2sas_cm0:     scan devices: pcie end devices start
May  6 16:02:51 Tower kernel: mpt2sas_cm0: log_info(0x3003011d): originator(IOP), code(0x03), sub_code(0x011d)
May  6 16:02:51 Tower kernel: mpt2sas_cm0: log_info(0x3003011d): originator(IOP), code(0x03), sub_code(0x011d)
May  6 16:02:51 Tower kernel: mpt2sas_cm0:     break from pcie end device scan: ioc_status(0x0022), loginfo(0x3003011d)
May  6 16:02:51 Tower kernel: mpt2sas_cm0:     pcie devices: pcie end devices complete
May  6 16:02:51 Tower kernel: mpt2sas_cm0: scan devices: complete
May  6 16:02:51 Tower kernel: sd 9:0:10:0: Power-on or device reset occurred
May  6 16:03:04 Tower rc.diskinfo[11368]: SIGHUP received, forcing refresh of disks info.
May  6 16:11:27 Tower kernel: sd 9:0:0:0: Power-on or device reset occurred
May  6 16:11:27 Tower kernel: sd 9:0:5:0: Power-on or device reset occurred
May  6 16:11:27 Tower kernel: sd 9:0:4:0: Power-on or device reset occurred
May  6 16:11:27 Tower kernel: sd 9:0:2:0: Power-on or device reset occurred
May  6 16:11:27 Tower kernel: sd 9:0:6:0: Power-on or device reset occurred
May  6 16:11:27 Tower kernel: sd 9:0:3:0: Power-on or device reset occurred
May  6 16:11:27 Tower kernel: sd 9:0:7:0: Power-on or device reset occurred
May  6 16:11:27 Tower kernel: sd 9:0:8:0: Power-on or device reset occurred
May  6 16:11:27 Tower kernel: sd 9:0:9:0: Power-on or device reset occurred
May  6 16:11:27 Tower kernel: sd 9:0:1:0: Power-on or device reset occurred
May  6 16:11:32 Tower kernel: sd 9:0:9:0: [sdk] tag#99 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
May  6 16:11:32 Tower kernel: sd 9:0:9:0: [sdk] tag#99 CDB: opcode=0x28 28 00 e8 e0 88 00 00 00 08 00
May  6 16:11:32 Tower kernel: print_req_error: I/O error, dev sdk, sector 3907028992
May  6 16:11:37 Tower kernel: sd 9:0:8:0: [sdj] tag#3198 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
May  6 16:11:37 Tower kernel: sd 9:0:8:0: [sdj] tag#3198 CDB: opcode=0x28 28 00 e8 e0 88 00 00 00 08 00
May  6 16:11:37 Tower kernel: print_req_error: I/O error, dev sdj, sector 3907028992
May  6 16:11:42 Tower kernel: sd 9:0:2:0: [sdd] tag#101 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
May  6 16:11:42 Tower kernel: sd 9:0:2:0: [sdd] tag#101 CDB: opcode=0x28 28 00 e8 e0 88 00 00 00 08 00
May  6 16:11:42 Tower kernel: print_req_error: I/O error, dev sdd, sector 3907028992
May  6 16:11:47 Tower kernel: sd 9:0:0:0: [sdb] tag#1128 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
May  6 16:11:47 Tower kernel: sd 9:0:0:0: [sdb] tag#1128 CDB: opcode=0x28 28 00 e8 e0 88 00 00 00 08 00
May  6 16:11:47 Tower kernel: print_req_error: I/O error, dev sdb, sector 3907028992
May  6 16:11:52 Tower kernel: sd 9:0:6:0: [sdh] tag#118 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
May  6 16:11:52 Tower kernel: sd 9:0:6:0: [sdh] tag#118 CDB: opcode=0x28 28 00 e8 e0 88 00 00 00 08 00
May  6 16:11:52 Tower kernel: print_req_error: I/O error, dev sdh, sector 3907028992
May  6 16:11:57 Tower kernel: sd 9:0:4:0: [sdf] tag#108 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
May  6 16:11:57 Tower kernel: sd 9:0:4:0: [sdf] tag#108 CDB: opcode=0x28 28 00 e8 e0 88 00 00 00 08 00
May  6 16:11:57 Tower kernel: print_req_error: I/O error, dev sdf, sector 3907028992
May  6 16:12:02 Tower kernel: sd 9:0:7:0: [sdi] tag#1344 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
May  6 16:12:02 Tower kernel: sd 9:0:7:0: [sdi] tag#1344 CDB: opcode=0x28 28 00 e8 e0 88 00 00 00 08 00
May  6 16:12:02 Tower kernel: print_req_error: I/O error, dev sdi, sector 3907028992
May  6 16:12:12 Tower kernel: sd 9:0:1:0: [sdc] tag#1344 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
May  6 16:12:12 Tower kernel: sd 9:0:1:0: [sdc] tag#1344 CDB: opcode=0x28 28 00 e8 e0 88 00 00 00 08 00
May  6 16:12:12 Tower kernel: print_req_error: I/O error, dev sdc, sector 3907028992
May  6 16:12:12 Tower kernel: sd 9:0:5:0: [sdg] tag#105 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
May  6 16:12:12 Tower kernel: sd 9:0:5:0: [sdg] tag#105 CDB: opcode=0x28 28 00 e8 e0 88 00 00 00 08 00
May  6 16:12:12 Tower kernel: print_req_error: I/O error, dev sdg, sector 3907028992
May  6 16:12:17 Tower kernel: sd 9:0:3:0: [sde] tag#2595 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
May  6 16:12:17 Tower kernel: sd 9:0:3:0: [sde] tag#2595 CDB: opcode=0x28 28 00 e8 e0 88 00 00 00 08 00
May  6 16:12:17 Tower kernel: print_req_error: I/O error, dev sde, sector 3907028992
May  6 16:12:17 Tower rc.diskinfo[11368]: SIGHUP received, forcing refresh of disks info.
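
Some of the `loginfo(...)` values in the log above (for example `0x310f0400` on the "break from expander scan" lines) are printed raw, while others are already split into originator, code and sub_code. A small Python sketch of that split follows. The bit layout is an assumption inferred from the decoded lines in this same log (`0x31080000` is shown as PL / 0x08 / 0x0000, `0x3003011d` as IOP / 0x03 / 0x011d), and the `IR` originator entry is likewise an assumption. It only separates the fields; it does not say what any particular code means.

```python
from typing import NamedTuple

# Originator names as they appear in the kernel output above; "IR" is assumed.
ORIGINATORS = {0x0: "IOP", 0x1: "PL", 0x2: "IR"}


class LogInfo(NamedTuple):
    bus_type: int    # top nibble (0x3 in every value in this log)
    originator: str  # which firmware layer raised the event
    code: int
    sub_code: int


def decode_log_info(value: int) -> LogInfo:
    """Split a 32-bit mpt2sas log_info word into its four fields."""
    return LogInfo(
        bus_type=(value >> 28) & 0xF,
        originator=ORIGINATORS.get((value >> 24) & 0xF, "unknown"),
        code=(value >> 16) & 0xFF,
        sub_code=value & 0xFFFF,
    )


if __name__ == "__main__":
    for raw in (0x31080000, 0x3003011D, 0x310F0400):
        info = decode_log_info(raw)
        print(f"0x{raw:08x}: originator({info.originator}), "
              f"code(0x{info.code:02x}), sub_code(0x{info.sub_code:04x})")
```

Under that assumed layout, `0x310f0400` splits into originator PL, code 0x0f, sub_code 0x0400.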

 

Thanks for any help.

 

Posted
6 hours ago, Mike012086 said:

The only thing I find while searching is a possible firmware issue and to downgrade to P19.

Not impossible, but I've never heard of that, and most users, including myself, use this firmware.
