djvj Posted April 1, 2019
I've been running these drives (and this hardware) for well over a couple of years now. I suddenly started getting tons of write errors, equally on 2 drives. This probably isn't the hard drives then, right? Should I be looking at either the controller or the cables first?
JorgeB Posted April 1, 2019
Please post the diagnostics: Tools -> Diagnostics
djvj Posted April 1, 2019 (Author)
storinator-diagnostics-20190401-1020.zip
Here you go.
JorgeB Posted April 1, 2019
It appears to be a problem with the RocketRAID controller. There were timeouts on several, if not all, of the disks, for example:
Mar 31 04:28:12 Storinator kernel: r750:[01:00 00] Start Soft Reset for 0/0
Mar 31 04:28:12 Storinator kernel: r750:[01:00 03] Start Soft Reset for 0/3
Mar 31 04:28:12 Storinator kernel: r750:[01:00 16] Start Soft Reset for 0/4
Mar 31 04:28:13 Storinator kernel: r750:[01:00 P3] Asyn Notification Received
Mar 31 04:28:21 Storinator kernel: r750:[01:00 13] Device request(1c) timeout.
Mar 31 04:28:21 Storinator kernel: r750:HIM_EVENT_DEVICE_TIMEOUT vd 00000000f0b5df3b
Mar 31 04:28:21 Storinator kernel: r750:[ ] Cdb [88, 0, 0, 0, 0, 0, 3,67, 87,10, 0, 0, 0,80, 0, 0].
Mar 31 04:28:21 Storinator kernel: r750:[ ] H2D FIS(Slot:1c): 00258127 40678710 00000003 00000080
Mar 31 04:28:23 Storinator kernel: r750:[01:00 P3] Asyn Notification Received
Mar 31 04:28:23 Storinator kernel: r750:[01:00 P3] GSCR changed
Mar 31 04:28:33 Storinator kernel: r750:[01:00 12] Start Soft Reset for 3/0
Mar 31 04:28:33 Storinator kernel: r750:[01:00 13] Start Soft Reset for 3/1
### [PREVIOUS LINE REPEATED 1 TIMES] ###
Mar 31 04:28:34 Storinator kernel: r750:[01:00 P3] Asyn Notification Received
Mar 31 04:28:42 Storinator kernel: r750:[01:00 13] Device request(39) timeout.
Mar 31 04:28:42 Storinator kernel: r750:HIM_EVENT_DEVICE_TIMEOUT vd 00000000f0b5df3b
Mar 31 04:28:42 Storinator kernel: r750:[ ] Cdb [88, 0, 0, 0, 0, 0, 3,67, c7,38, 0, 0, 0,80, 0, 0].
Mar 31 04:28:42 Storinator kernel: r750:[ ] H2D FIS(Slot:39): 00258127 4067c738 00000003 00000080
Mar 31 04:28:44 Storinator kernel: r750:[01:00 P3] Asyn Notification Received
Mar 31 04:28:44 Storinator kernel: r750:[01:00 P3] GSCR changed
Mar 31 04:28:53 Storinator kernel: r750:[01:00 12] Start Soft Reset for 3/0
Mar 31 04:28:53 Storinator kernel: r750:[01:00 13] Start Soft Reset for 3/1
And those two ended up being dropped:
Mar 31 04:35:22 Storinator kernel: r750:[01:00 09] Reset Phase 2 failed for 1/1
Mar 31 04:35:22 Storinator kernel: r750:[01:00 09] disk removed (0).
Mar 31 04:35:22 Storinator kernel: r750:[01:00 10] Start Soft Reset for 2/2
Mar 31 04:35:22 Storinator kernel: r750:[01:00 10] Request failed. Error information 0x90800000
Mar 31 04:35:22 Storinator kernel: r750:[ ] H2D FIS: 00000227 00000000 00000000 04000000
Mar 31 04:35:22 Storinator kernel: r750:[01:00 10] Reset Phase 2 failed for 2/1
Mar 31 04:35:22 Storinator kernel: r750:[01:00 10] disk removed (0).
A reboot should fix it for now, though you'll need to rebuild the disks. If it happens again, you might want to consider replacing the controller with one of the recommended LSI HBAs.
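If you want to see how widespread the timeouts are without reading the whole syslog by eye, something like the following might help. It is only a rough sketch: it tallies the r750 timeout, soft-reset and "disk removed" lines per bracketed tag, using the line shapes from the excerpts above. The function name `summarize`, the regexes and the sample list are illustrative, not part of any Unraid or HighPoint tool, and the meaning of the tag inside the brackets is an assumption (it is treated as an opaque device label).

```python
import re
from collections import Counter

# Assumed line shape, copied from the excerpts in this thread:
#   "... kernel: r750:[01:00 13] Device request(1c) timeout."
# The last bracketed field is treated as an opaque device tag (assumption).
LINE_RE = re.compile(r"kernel: r750:\[(?P<ctrl>\S+)\s+(?P<dev>\S+)\]\s+(?P<msg>.*)$")

# Message patterns taken from the log lines quoted above.
EVENTS = {
    "timeout": re.compile(r"Device request\([0-9a-fA-F]+\) timeout"),
    "soft_reset": re.compile(r"Start Soft Reset"),
    "removed": re.compile(r"disk removed"),
}


def summarize(lines):
    """Count each event type per device tag; lines that don't match are skipped."""
    counts = {name: Counter() for name in EVENTS}
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        for name, pattern in EVENTS.items():
            if pattern.search(m.group("msg")):
                counts[name][m.group("dev")] += 1
    return counts


# A few lines from the post above, used as sample input.
sample = [
    "Mar 31 04:28:12 Storinator kernel: r750:[01:00 00] Start Soft Reset for 0/0",
    "Mar 31 04:28:21 Storinator kernel: r750:[01:00 13] Device request(1c) timeout.",
    "Mar 31 04:28:21 Storinator kernel: r750:HIM_EVENT_DEVICE_TIMEOUT vd 00000000f0b5df3b",
    "Mar 31 04:28:42 Storinator kernel: r750:[01:00 13] Device request(39) timeout.",
    "Mar 31 04:35:22 Storinator kernel: r750:[01:00 09] disk removed (0).",
    "Mar 31 04:35:22 Storinator kernel: r750:[01:00 10] disk removed (0).",
]

counts = summarize(sample)
for name in EVENTS:
    print(name, dict(counts[name]))
```

To run it against a real log, replace `sample` with the lines of the syslog file from the diagnostics zip, for instance `open(path).read().splitlines()` with `path` pointing at wherever you extracted it. Resets and timeouts spread across many tags would be consistent with a controller-side problem rather than two failing drives.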
djvj Posted April 2, 2019 (Author, edited April 2, 2019)
I figured it wasn't the drives, but replacing an $800 controller is a much more costly venture; I'd rather it be bad drives. It's been working fine since 2014. It was the only one I could find that supported all the ports I needed. Possibly one of the ports failed, then. I'll reboot and test. Thank you.