twg Posted October 19, 2018 (edited)

I recently had a data drive quit on me, or at least Unraid said so, so I replaced it and it went through a data rebuild. During the rebuild the server froze, so I rebooted it. It completed rebuilding the data drive, and when it finished I saw the following message:

Event: Unraid Parity sync / Data rebuild
Subject: Notice [TOWER] - Parity sync / Data rebuild finished (11640829 errors)
Description: Duration: 1 day, 6 minutes, 23 seconds. Average speed: 92.2 MB/s
Importance: warning

What does it mean when it lists all those errors? Are those errors in the drive rebuild, i.e. is there bad data? When I check the drive log, I get a whole bunch of these:

Oct 18 20:45:35 Tower kernel: sd 13:0:5:0: [sdo] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Oct 18 20:45:35 Tower kernel: sd 13:0:5:0: [sdo] tag#0 CDB: opcode=0x88 88 00 00 00 00 03 9d f4 24 a0 00 00 02 00 00 00
Oct 18 20:45:35 Tower kernel: print_req_error: I/O error, dev sdo, sector 15534924960
Oct 18 20:45:35 Tower kernel: sd 13:0:5:0: [sdo] tag#1 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Oct 18 20:45:35 Tower kernel: sd 13:0:5:0: [sdo] tag#1 CDB: opcode=0x88 88 00 00 00 00 03 9d f4 26 a0 00 00 02 00 00 00
Oct 18 20:45:35 Tower kernel: print_req_error: I/O error, dev sdo, sector 15534925472
Oct 18 20:45:35 Tower kernel: sd 13:0:5:0: [sdo] tag#2 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Oct 18 20:45:35 Tower kernel: sd 13:0:5:0: [sdo] tag#2 CDB: opcode=0x88 88 00 00 00 00 03 9d f4 28 a0 00 00 02 00 00 00
Oct 18 20:45:35 Tower kernel: print_req_error: I/O error, dev sdo, sector 15534925984

I've attached the full drive log.

So I was getting some really weird issues: multiple drives would drop out on me, and different drives every time I rebooted. I opened my server and it seemed like some power cables were loose, so I plugged those back in. Multiple drives were still failing on me, and it seemed to be coming from one drive controller, the AOC-SASLP-MV8. Luckily I had a spare AOC-SAS2LP-MV8 controller, so I plugged that in. I see almost all of my drives, except my parity drive is not listed. I hear a drive struggling to seek properly, and sure enough it's my parity drive. It seems my parity drive has died.

Now I'm not sure what to do. Did my original data drive rebuild properly, considering the errors I got? I still have the failed data drive. Any suggestions? I have a spare drive I can use to replace the parity drive, but I'm hesitant to do anything at this point that may be permanent and damage my data. Help!!

Edited October 31, 2020 by twg
JorgeB Posted October 19, 2018

Please post your diagnostics: Tools -> Diagnostics
twg (Author) Posted October 20, 2018

I put a new drive in to replace the failed parity drive, and the parity rebuild finished. I then decided to buy another drive and run 2 parity drives to cover myself, and within 2 hours of adding the 2nd parity drive, another one of my disks redballed. The relevant part of the log shows similar errors:

Oct 20 14:21:48 Tower emhttpd: shcmd (217): echo 128 > /sys/block/sdp/queue/nr_requests
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#0 CDB: opcode=0x88 88 00 00 00 00 00 82 30 e8 10 00 00 02 f8 00 00
Oct 20 18:09:01 Tower kernel: print_req_error: I/O error, dev sdp, sector 2184243216
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#1 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#1 CDB: opcode=0x88 88 00 00 00 00 00 82 30 e6 d0 00 00 01 40 00 00
Oct 20 18:09:01 Tower kernel: print_req_error: I/O error, dev sdp, sector 2184242896
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#2 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#2 CDB: opcode=0x88 88 00 00 00 00 00 82 30 e2 d0 00 00 04 00 00 00
Oct 20 18:09:01 Tower kernel: print_req_error: I/O error, dev sdp, sector 2184241872
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#3 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#3 CDB: opcode=0x88 88 00 00 00 00 00 82 30 e1 90 00 00 01 40 00 00
Oct 20 18:09:01 Tower kernel: print_req_error: I/O error, dev sdp, sector 2184241552
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#4 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#4 CDB: opcode=0x88 88 00 00 00 00 00 82 30 dd 90 00 00 04 00 00 00
Oct 20 18:09:01 Tower kernel: print_req_error: I/O error, dev sdp, sector 2184240528
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#5 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#5 CDB: opcode=0x88 88 00 00 00 00 00 82 30 dc 50 00 00 01 40 00 00
Oct 20 18:09:01 Tower kernel: print_req_error: I/O error, dev sdp, sector 2184240208
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#6 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#6 CDB: opcode=0x88 88 00 00 00 00 00 82 30 d8 50 00 00 04 00 00 00
Oct 20 18:09:01 Tower kernel: print_req_error: I/O error, dev sdp, sector 2184239184
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#7 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#7 CDB: opcode=0x88 88 00 00 00 00 00 82 30 d7 08 00 00 01 48 00 00
Oct 20 18:09:01 Tower kernel: print_req_error: I/O error, dev sdp, sector 2184238856
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#8 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#8 CDB: opcode=0x88 88 00 00 00 00 00 82 30 d3 08 00 00 04 00 00 00
Oct 20 18:09:01 Tower kernel: print_req_error: I/O error, dev sdp, sector 2184237832
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#9 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#9 CDB: opcode=0x88 88 00 00 00 00 00 82 30 cf 08 00 00 04 00 00 00
Oct 20 18:09:01 Tower kernel: print_req_error: I/O error, dev sdp, sector 2184236808
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] Read Capacity(16) failed: Result: hostbyte=0x04 driverbyte=0x00
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] Sense not available.
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] Read Capacity(10) failed: Result: hostbyte=0x04 driverbyte=0x00
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] Sense not available.
Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] 0 512-byte logical blocks: (0 B/0 B)

I'm beginning to think that 3 of my drives failing within 1-2 days is too much of a coincidence; there must be something else going on. I've attached the output of my diagnostics:

tower-diagnostics-20181020-1829.zip
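One way to test the "too coincidental" suspicion is to group the failed commands by SCSI address (host:channel:target:lun) instead of by drive letter. The following is a minimal Python sketch, assuming the syslog format shown in this thread; the function name `count_failures` and the `SAMPLE` list are illustrative, not part of Unraid or any tool. The sample lines are copied from the two excerpts posted above.

```python
import re
from collections import Counter

# Matches kernel lines that report a failed SCSI command, for example:
# "... kernel: sd 13:0:5:0: [sdp] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00"
RESULT_RE = re.compile(
    r"sd (?P<addr>\d+:\d+:\d+:\d+): \[(?P<dev>sd[a-z]+)\]"
    r".*Result: hostbyte=(?P<host>0x[0-9a-fA-F]+)"
)


def count_failures(lines):
    """Count failed commands per (SCSI address, device name, hostbyte)."""
    counts = Counter()
    for line in lines:
        m = RESULT_RE.search(line)
        if m:
            counts[(m["addr"], m["dev"], m["host"])] += 1
    return counts


# Lines taken from the Oct 18 and Oct 20 excerpts in this thread.
SAMPLE = [
    "Oct 18 20:45:35 Tower kernel: sd 13:0:5:0: [sdo] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00",
    "Oct 18 20:45:35 Tower kernel: print_req_error: I/O error, dev sdo, sector 15534924960",
    "Oct 18 20:45:35 Tower kernel: sd 13:0:5:0: [sdo] tag#1 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00",
    "Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00",
    "Oct 20 18:09:01 Tower kernel: sd 13:0:5:0: [sdp] tag#1 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00",
]

counts = count_failures(SAMPLE)
addresses = {addr for addr, _dev, _host in counts}

for (addr, dev, host), n in sorted(counts.items()):
    print(f"{addr} [{dev}] hostbyte={host}: {n} failed commands")
print("distinct SCSI addresses:", sorted(addresses))
```

In both excerpts the failing device sits at the same address, 13:0:5:0, even though the drive letter differs (sdo, then sdp). Drive letters and host numbers can change between boots and after a controller swap, so this is only a hint, but errors that follow a port or controller rather than a particular disk would fit the suspicion that something other than the drives is at fault. To run it on a full log, pass the lines of the syslog file from the diagnostics zip to `count_failures`.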
JorgeB Posted October 20, 2018

Disk5 dropped offline. It's on a SAS2LP (or similar), and those are known to drop disks. I can't see SMART for it, so you'll need to reboot, but if the other disks that "failed" were also using that controller, you should replace it with an LSI.
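For readers trying to interpret the log lines quoted in this thread, here is a small Python sketch that decodes the two fields that matter most. It assumes the standard READ(16) command layout (opcode 0x88, an 8-byte big-endian LBA in bytes 2-9, a 4-byte transfer length in bytes 10-13) and the Linux kernel's SCSI host-byte codes, of which only a subset is listed. The names `HOST_BYTE` and `decode_read16` are made up for this example.

```python
# Subset of the Linux kernel SCSI host-byte codes (the "hostbyte=" field).
HOST_BYTE = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x07: "DID_ERROR",
}


def decode_read16(cdb_hex):
    """Decode a READ(16) CDB given as space-separated hex bytes.

    Returns (starting LBA, number of blocks requested).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2-9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10-13: transfer length
    return lba, blocks


# CDBs copied from the sdo (Oct 18) and sdp (Oct 20) log excerpts.
first_sdo = decode_read16("88 00 00 00 00 03 9d f4 24 a0 00 00 02 00 00 00")
first_sdp = decode_read16("88 00 00 00 00 00 82 30 e8 10 00 00 02 f8 00 00")

print("sdo read:", first_sdo)
print("sdp read:", first_sdp)
print("hostbyte 0x04:", HOST_BYTE[0x04])
```

The decoded LBAs match the sector numbers in the accompanying `print_req_error` lines (15534924960 and 2184243216), so each error is an ordinary read that the kernel could not complete. The host byte 0x04 is `DID_BAD_TARGET`, which is set on the host side and generally means the target could not be addressed, rather than the disk itself reporting a bad sector. Together with "Sense not available" and the 0-block capacity, that reading is consistent with the disk having dropped off the controller, as described above.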